FileReader - Removing the BOM character with Java
This question already has an answer here:
Byte order mark screws up file reading in Java (7 answers)

I am trying to read files using FileReader and write them to a separate file. These files are UTF-8 encoded, but unfortunately some of them still contain a BOM. The relevant code I tried is this:
    private final String UTF8_BOM = "\uFEFF";

    private String removeUTF8BOM(String s) {
        if (s.startsWith(UTF8_BOM)) {
            s = s.replace(UTF8_BOM, "");
        }
        return s;
    }

    line = removeUTF8BOM(line);
But for some reason the BOM is not removed. Is there any other way I can do this with FileReader? I know that there is the BOMInputStream, which should work, but I'd rather find a solution using FileReader.
Naive solution to the question as asked:

    public static void main(final String[] args) {
        final String hasBOM = "\uFEFF" + "hello world!";
        // Drop the first char only when it is the BOM character U+FEFF.
        final String noBOM = hasBOM.charAt(0) == '\uFEFF' ? hasBOM.substring(1) : hasBOM;
        System.out.println(hasBOM.equals(noBOM));
    }

Outputs: false
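If you want to stay with a Reader, the same check can be applied once at the start of the stream instead of on every line. The following is only a minimal sketch: the class name BomSkippingReaderDemo and the helper skipBom are made up for illustration, and it assumes the underlying reader really decodes the file as UTF-8. One possible reason the check in the question never matches is that FileReader uses the platform default charset on older JDKs, in which case the BOM bytes do not decode to U+FEFF at all; on Java 11 and later, a charset can be passed to the FileReader constructor.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class BomSkippingReaderDemo {
    private static final char BOM = '\uFEFF';

    /** Wraps the reader and consumes a leading U+FEFF, if present (hypothetical helper). */
    static BufferedReader skipBom(final Reader in) throws IOException {
        final BufferedReader reader = new BufferedReader(in);
        reader.mark(1);              // remember the start of the stream
        if (reader.read() != BOM) {
            reader.reset();          // not a BOM (or empty input): rewind
        }
        return reader;
    }

    public static void main(final String[] args) throws IOException {
        // A StringReader stands in for a FileReader opened with the UTF-8 charset.
        try (BufferedReader r = skipBom(new StringReader("\uFEFFhello world!"))) {
            System.out.println(r.readLine()); // prints "hello world!"
        }
    }
}
```

Because the BOM is consumed before the first readLine() call, the rest of the reading loop does not need to know about it.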
Proper solution approach:

You should never program against a File-based API; instead, program against InputStream/OutputStream so that your code is portable to different source locations.
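As a rough illustration of that portability point, here is a small sketch in which the processing method only sees an InputStream. The class name PortableSourceDemo, the method countLines, and the file name in the comment are hypothetical and chosen only for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class PortableSourceDemo {
    /** Counts lines from any byte source; the caller decides where the bytes come from. */
    static long countLines(final InputStream in) throws IOException {
        final BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines().count();
    }

    public static void main(final String[] args) throws IOException {
        // In-memory data here; no file is needed to exercise the method.
        final byte[] data = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(countLines(in)); // prints 2
        }
        // The same method works unchanged for a file (hypothetical path), e.g.:
        // try (InputStream in = Files.newInputStream(Path.of("input.txt"))) { countLines(in); }
    }
}
```

Since the method never mentions a file, any BOM handling can be added later by wrapping the stream before it is passed in.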
This is just an untested illustration of how you might go about encapsulating this behavior in an InputStream to make it transparent.
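    import java.io.IOException;
    import java.io.InputStream;
    import java.io.PushbackInputStream;
    import java.util.Objects;

    public class BOMProofInputStream extends InputStream {
        private final PushbackInputStream is;
        private boolean isFirstRead = true;

        public BOMProofInputStream(final InputStream is) {
            // Room to push back up to three bytes if they turn out not to be a BOM.
            this.is = new PushbackInputStream(Objects.requireNonNull(is), 3);
        }

        @Override
        public int read() throws IOException {
            if (this.isFirstRead) {
                this.isFirstRead = false;
                // In UTF-8 the BOM is the byte sequence EF BB BF; a single byte
                // can never equal the char U+FEFF, so check three bytes.
                final byte[] head = new byte[3];
                final int n = this.is.readNBytes(head, 0, 3); // Java 9+
                final boolean isBOM = n == 3 && (head[0] & 0xFF) == 0xEF
                        && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF;
                if (!isBOM) {
                    this.is.unread(head, 0, n); // not a BOM: hand the bytes back
                }
            }
            return this.is.read();
        }

        @Override
        public void close() throws IOException {
            this.is.close();
        }
    }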
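A note on the byte-level check: the BOM is the single character U+FEFF only after decoding. In a UTF-8 byte stream it is the three bytes EF BB BF, so a wrapper around an InputStream has to inspect three bytes rather than compare one byte with '\uFEFF'. The short sketch below shows both views; the class name BomBytesDemo and the helper startsWithUtf8Bom are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class BomBytesDemo {
    /** Returns true when the array starts with the UTF-8 BOM bytes EF BB BF. */
    static boolean startsWithUtf8Bom(final byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    public static void main(final String[] args) {
        final byte[] encoded = "\uFEFFhello".getBytes(StandardCharsets.UTF_8);

        // Byte view: the BOM occupies the first three bytes.
        for (int i = 0; i < 3; i++) {
            System.out.printf("%02X ", encoded[i] & 0xFF); // prints "EF BB BF "
        }
        System.out.println();
        System.out.println(startsWithUtf8Bom(encoded)); // prints true

        // Char view: decoding as UTF-8 keeps the BOM as the single char U+FEFF.
        final String decoded = new String(encoded, StandardCharsets.UTF_8);
        System.out.println(decoded.charAt(0) == '\uFEFF'); // prints true
    }
}
```

This is also why the String-based check works only when the text was decoded as UTF-8 in the first place.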
I found a full-fledged example by searching for: java filereader byte-order-mark