UTF-8 - Read next character (full Unicode code point) from Java input stream
I need to parse UTF-8 input (from a text file) character by character, where by "character" I mean a full Unicode code point, not Java's char.
What approach should I use?
Since Java 8 there's CharSequence.codePoints().
For example:
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.stream.IntStream;

    // If you want to work line by line, use Files.readAllLines().
    // If you use Guava, there's Guava's Files.toString() for reading a whole file into a String.
    byte[] bytes = Files.readAllBytes(Paths.get("test.txt"));
    String text = new String(bytes, StandardCharsets.UTF_8);
    IntStream codePoints = text.codePoints();
    // Print each code point as an int
    codePoints.forEach(codePoint -> System.out.println(codePoint));
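The example above loads the whole file into memory first. If the input has to be consumed incrementally, as the title's "input stream" suggests, one possible approach is to decode through a Reader and combine surrogate pairs by hand. The following is only a sketch: the class name CodePointReader and the method nextCodePoint are made up for illustration, and it assumes the reader supports mark/reset (BufferedReader and StringReader do).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the JDK.
public class CodePointReader {

    /**
     * Returns the next Unicode code point from the reader, or -1 at end of input.
     * Assumes reader.markSupported() is true; otherwise mark() throws IOException.
     */
    public static int nextCodePoint(Reader reader) throws IOException {
        int first = reader.read();
        if (first == -1) {
            return -1; // end of input
        }
        char high = (char) first;
        if (!Character.isHighSurrogate(high)) {
            return first; // BMP character (or a lone low surrogate), returned as-is
        }
        // A high surrogate should be followed by a low surrogate.
        reader.mark(1);
        int second = reader.read();
        if (second != -1 && Character.isLowSurrogate((char) second)) {
            return Character.toCodePoint(high, (char) second);
        }
        // Unpaired high surrogate: push back whatever was read and return it alone.
        reader.reset();
        return first;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java CodePointReader <file>");
            return;
        }
        // Files.newBufferedReader decodes the bytes as UTF-8 while reading.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            int codePoint;
            while ((codePoint = nextCodePoint(reader)) != -1) {
                System.out.println(codePoint);
            }
        }
    }
}
```

The Reader does the UTF-8 decoding into UTF-16 chars, so the helper only has to join a high and a low surrogate into one code point with Character.toCodePoint.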
java utf-8