Class TextFormat.Tokenizer

java.lang.Object
com.google.protobuf.TextFormat.Tokenizer
Enclosing class:
TextFormat

private static final class TextFormat.Tokenizer extends Object
Represents a stream of tokens parsed from a CharSequence.

The Java standard library provides several classes that look useful for implementing this, but none of them fits. For example:

  • java.io.StreamTokenizer: This comes close to what we want, except for one fatal flaw: it automatically un-escapes strings using Java escape sequences, which do not include all the escape sequences we need to support (e.g. '\x').
  • java.util.Scanner: This seems like a good way to match tokens against regular expressions directly from a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, Scanner requires that tokens be separated by a delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), Scanner would recognize it only as a single token. Furthermore, Scanner provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
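The Scanner limitation described above can be reproduced with a short, self-contained sketch. The class name ScannerVsRegexDemo, its helper methods, and the TOKEN pattern are hypothetical and chosen only for illustration. In particular, the pattern is a simplified stand-in and is not the pattern TextFormat.Tokenizer actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScannerVsRegexDemo {

    // Hypothetical, simplified token pattern: an identifier, or any single
    // non-whitespace character. This is an assumption made for the demo only.
    private static final Pattern TOKEN =
            Pattern.compile("[a-zA-Z_][0-9a-zA-Z_]*|\\S");

    // Splits the input with Scanner's default whitespace delimiter.
    static List<String> scannerTokens(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    // Collects successive regex matches, so adjacent tokens do not need
    // a delimiter between them.
    static List<String> regexTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scannerTokens("foo:")); // prints [foo:]
        System.out.println(regexTokens("foo:"));   // prints [foo, :]
    }
}
```

With the default delimiter, Scanner returns "foo:" as one token because no whitespace separates the identifier from the colon. Matching a token pattern against the text directly yields "foo" and ":" separately.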
  • Field Details

    • text

      private final CharSequence text
    • currentToken

      private String currentToken
    • pos

      private int pos
    • line

      private int line
    • column

      private int column
    • lineInfoTrackingPos

      private int lineInfoTrackingPos
    • previousLine

      private int previousLine
    • previousColumn

      private int previousColumn
  • Constructor Details

    • Tokenizer

      private Tokenizer(CharSequence text)
      Constructs a tokenizer that parses tokens from the given text.
  • Method Details