Tuesday, 15 January 2013

UTF-8 - Read next character (full Unicode code point) from Java input stream




I need to parse UTF-8 input (from a text file) character by character, where by "character" I mean a full UTF-8 character (a Unicode code point), not Java's char.

What approach should I use?

Since Java 8 there's CharSequence.codePoints(), which returns an IntStream of full code points and combines surrogate pairs for you.

For example:

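    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.stream.IntStream;

    // If you want to work line by line, use Files.readAllLines().
    // If you use Guava, there's Guava's Files.toString() for reading the whole file into a String.
    byte[] bytes = Files.readAllBytes(Paths.get("test.txt"));
    String text = new String(bytes, StandardCharsets.UTF_8);
    IntStream codePoints = text.codePoints();
    // Process the code points; here each one is printed as its numeric value.
    codePoints.forEach(codePoint -> System.out.println(codePoint));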
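The example above loads the whole file into memory first. If the input really has to be consumed as a stream, one possible approach is to wrap it in an InputStreamReader and combine surrogate pairs by hand. The following is only a sketch: the class name CodePointReader and the methods nextCodePoint and readAll are hypothetical names made up for this illustration, and it assumes the decoder never emits an unpaired high surrogate (it throws if one shows up).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK.
final class CodePointReader {

    // Returns the next full Unicode code point, or -1 at end of stream.
    static int nextCodePoint(Reader reader) throws IOException {
        int first = reader.read();
        if (first < 0) {
            return -1; // end of stream
        }
        if (!Character.isHighSurrogate((char) first)) {
            return first; // a single char already is the code point
        }
        // A high surrogate must be followed by a low surrogate.
        int second = reader.read();
        if (second < 0 || !Character.isLowSurrogate((char) second)) {
            throw new IOException("Unpaired high surrogate in input");
        }
        return Character.toCodePoint((char) first, (char) second);
    }

    // Decodes the stream as UTF-8 and collects every code point in order.
    static List<Integer> readAll(InputStream in) {
        List<Integer> result = new ArrayList<>();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int codePoint;
            while ((codePoint = nextCodePoint(reader)) >= 0) {
                result.add(codePoint);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

For a file, something like CodePointReader.readAll(Files.newInputStream(Paths.get("test.txt"))) should work. In a real parser you would more likely call nextCodePoint in a loop and handle each code point as it arrives instead of collecting them into a list.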

java utf-8
