The fast scanner generator for Java™ with full Unicode support

Overview

JFlex

JFlex is a lexical analyzer generator (also known as scanner generator) for Java.

JFlex takes as input a specification with a set of regular expressions and corresponding actions. It generates Java source of a lexer that reads input, matches the input against the regular expressions in the spec file, and runs the corresponding action if a regular expression matched. Lexers usually are the first front-end step in compilers, matching keywords, comments, operators, etc, and generating an input token stream for parsers.

JFlex lexers are based on deterministic finite automata (DFAs). They are fast, without expensive backtracking.
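To make the DFA idea concrete, here is a minimal hand-written sketch of a table-driven scanner. It is not JFlex output: the class name, the character classes, the transition table and the token names are all assumptions chosen for illustration. It shows why matching needs only one table lookup per input character. In this toy automaton every non-start state is accepting, so it never has to back up; a general longest-match scanner would additionally remember the last accepting position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written sketch of a table-driven DFA scanner; not JFlex output. */
public class TinyDfaScanner {
    // Hypothetical character classes: 0 = letter, 1 = digit, 2 = anything else.
    static int charClass(char c) {
        if (Character.isLetter(c)) return 0;
        if (Character.isDigit(c)) return 1;
        return 2;
    }

    // States: 0 = start, 1 = in identifier, 2 = in number; -1 means no transition.
    static final int[][] TRANSITIONS = {
        { 1,  2, -1},  // start
        { 1,  1, -1},  // identifier: letters or digits may follow
        {-1,  2, -1},  // number: only digits may follow
    };
    static final String[] ACCEPT = {null, "IDENT", "NUMBER"};

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int state = 0;
            int end = pos;
            // Follow transitions as far as possible: one table lookup per character.
            while (end < input.length()) {
                int next = TRANSITIONS[state][charClass(input.charAt(end))];
                if (next < 0) break;
                state = next;
                end++;
            }
            if (ACCEPT[state] != null) {
                tokens.add(ACCEPT[state] + "(" + input.substring(pos, end) + ")");
                pos = end;
            } else {
                pos++; // skip a character that no rule matches (e.g. whitespace)
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("x1 = 42")); // prints [IDENT(x1), NUMBER(42)]
    }
}
```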

Usage

For documentation and more information see the JFlex documentation and the wiki.

Usage with Maven

You need Maven 3.5.2 or later, and JDK 8 or later.

  1. Place grammar files in the src/main/flex/ directory.

  2. Extend the build section of the project POM with the jflex-maven-plugin:

  <build>
    <plugins>
      <plugin>
        <groupId>de.jflex</groupId>
        <artifactId>jflex-maven-plugin</artifactId>
        <version>1.8.2</version>
        <executions>
          <execution>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  3. Voilà: Java code is produced in target/generated-sources/ during the generate-sources phase (which happens before the compile phase) and included in the compilation scope.

Usage with ant

You need ant, the binary jflex jar and JDK 8 or later.

  1. Define the ant task:
<taskdef classname="jflex.anttask.JFlexTask" name="jflex"
         classpath="path-to-jflex.jar"/>
  2. Use it:
<jflex file="src/grammar/parser.flex" destdir="build/generated/"/>
<javac srcdir="build/generated/" destdir="build/classes/"/>

Usage with Bazel

We provide a jflex rule

load("@jflex_rules//jflex:jflex.bzl", "jflex")

jflex(
    name = "",           # Choose a rule name
    srcs = [],           # Add input lex specifications
    outputs = [],        # List expected generated files
)

See the sample simple BUILD file.

Usage in CLI

You need the binary jflex jar and JDK 8 or later.

You can also use JFlex directly from the command line:

jflex/bin/jflex src/grammar/parser.flex

Or:

java -jar jflex-full-1.8.2.jar -d output src/grammar/parser.flex

Other build tools

See Build tool plugins.

Examples

Have a look at the sample project: simple and other examples.

Contributing

JFlex is free software; contributions are welcome. See the Contributing page for instructions.

Source layout

The top-level directory of the JFlex git repository contains:

  • cup: a copy of the CUP runtime
  • cup-maven-plugin: a simple Maven plugin to generate a parser with CUP
  • docs: the Markdown sources for the user manual
  • java: Java sources [WIP, Bazel]
  • javatests: Java sources of tests [WIP, Bazel]
  • jflex: JFlex, the scanner/lexer generator for Java
  • jflex-maven-plugin: the JFlex Maven plugin, which helps to integrate JFlex in your project
  • jflex-unicode-plugin: the JFlex Unicode Maven plugin, used for compiling JFlex
  • testsuite: the regression test suite for JFlex
  • third_party: third-party libraries used by examples of the Bazel build system

Build from source

Build with Bazel

JFlex can be built with Bazel. The migration to Bazel is still a work in progress, for instance for the test suite.

You need Bazel.

bazel build //jflex:jflex_bin

This builds bazel-bin/jflex/jflex_bin, which you can run directly:

bazel-bin/jflex/jflex_bin --info

Or:

bazel run //jflex:jflex_bin -- --info

Build uberjar (aka fatjar aka deploy jar)

bazel build jflex/jflex_bin_deploy.jar

Continuous integration is done with Cirrus CI.

Build with Maven

You need JDK 8 or later.

./mvnw install

This generates jflex/target/jflex-full-1.9.0-SNAPSHOT.jar, which you can use, e.g.:

java -jar jflex/target/jflex-full-1.9.0-SNAPSHOT.jar --info

Continuous integration is done with Travis.

Issues
  • [Bug] Error in skeleton.nested [sf#132]

    Reported by jeningar on 2014-09-23 10:37 UTC Hi Steve, hi Gerwin,

    using the skeleton.nested, the resulting program showed the following behaviour:

    echo "list sessions;" | sdmsh

    hangs forever. But

    echo "list sessions;" > /tmp/x
    sdmsh < /tmp/x

    terminates as it should.

    After merging the skeleton.default and skeleton.nested this erroneous behaviour vanished.

    Diffing the old and new skeleton.nested shows (< == new; > == old):

    $ diff skeleton.nested* | more
    39c39
    < 
    
    ---
    >   
    169c169
    <     while (numRead == 0) { // bug #130 discussion; while is better than if
    
    ---
    >     if (numRead == 0) {
    403,409d402
    <     // cached fields:
    <     int zzCurrentPosL;
    <     int zzMarkedPosL;
    <     int zzEndReadL = zzEndRead;
    <     char [] zzBufferL = zzBuffer;
    <     char [] zzCMapL = ZZ_CMAP;
    < 
    413c406,411
    <       zzMarkedPosL = zzMarkedPos;
    
    ---
    >       // cached fields:
    >       int zzCurrentPosL;
    >       int zzMarkedPosL = zzMarkedPos;
    >       int zzEndReadL = zzEndRead;
    >       char [] zzBufferL = zzBuffer;
    >       char [] zzCMapL = ZZ_CMAP;
    

    The HUGE difference is that the variables (notably zzEndReadL) aren't initialized by skeleton.default in every iteration of the while loop.

    During the merge I wondered why there are two skeletons in the first place. Those who don't want to read from multiple streams just don't call yypushStream() and friends. One skeleton would be more than enough (and a link for backward compatibility).

    For the sake of completeness I attached the new skeleton.nested.

    Regards,

    Ronald

    bug 
    opened by lsf37 27
  • Circular dependencies on java_cup/javacup

    java_cup/javacup needs jflex to build, and jflex needs java_cup/javacup to build. The two depend on each other circularly, so a bootstrap is required: there is no clean way to build either from source without using a pre-made binary. Ideally this should change so that either can be built fully from source without requiring a pre-built jar. This must have been possible initially, before either was made; it is a chicken-and-egg situation. Java seems to have many such cases: jaxen/jdom, antlr/stringtemplate, and jflex/javacup. The same applies to the JDK itself, though OpenJDK can technically be built from source going back to 1.5 with gcj and gnu-classpath.

    Hopefully that is possible here: some old version of either javacup or jflex may not need the other and could be the first step in a build-from-source solution. At present each has to use binaries, which is not preferred. Thank you for your consideration in addressing this circular dependency issue.

    question 
    opened by wltjr 25
  • [Bug] Re-enable scanning interactively or from a network byte stream [sf#130]

    Reported by jeningar on 2014-08-05 13:14 UTC Hi,

    We have an application that receives commands over a TCP/IP socket. After each command a reply is sent. Normally the communication is interactive, which means a strict order of questions and answers. This works perfectly with jflex 1.4.x. (The grammar is easy from the jflex point of view. Every command is terminated by a semicolon, no read ahead necessary; this follows your FAQ answer on "I want my scanner to read from a network byte stream or from interactive stdin. Can I do this with JFlex?").

    Now we have some problems with JFlex 1.6.0.

    JFlex generates the following code in zzRefill():

    ...
        int requested = zzBuffer.length - zzEndRead;
        int totalRead = 0;
        while (totalRead < requested) {
          int numRead = zzReader.read(zzBuffer, zzEndRead + totalRead, requested - totalRead);
          if (numRead == -1) {
            break;
          }
          totalRead += numRead;
        }
    ...
    

    This code reads bytes from the input as long as the buffer isn't full or until EOF is reached. This is nice in case of reading a file, but leads to deadlocks if using an interactive scanner.

    The easiest way to avoid the deadlock is to eliminate the while loop. If the assumption that a read() returns at least one character (or EOF) isn't valid, the while loop must be executed as long as nothing is read. (I tested this; it seems to work flawlessly).
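The two refill strategies discussed above can be compared in a small standalone sketch. This is not the code JFlex generates: the class, the method names and the ChunkReader helper (a stand-in for a socket that delivers one command per read call) are hypothetical. With a real socket, the fill-the-buffer loop would block after the first command instead of reaching end of input.

```java
import java.io.IOException;
import java.io.Reader;

public class RefillSketch {
    /** Hypothetical reader that hands out one chunk per read() call, like an interactive peer. */
    static class ChunkReader extends Reader {
        private final String[] chunks;
        private int next = 0;
        ChunkReader(String... chunks) { this.chunks = chunks; }
        @Override public int read(char[] buf, int off, int len) {
            if (next >= chunks.length) return -1;      // end of input
            String chunk = chunks[next++];
            int n = Math.min(len, chunk.length());     // demo only: the excess is dropped
            chunk.getChars(0, n, buf, off);
            return n;
        }
        @Override public void close() {}
    }

    /** Keeps reading until the buffer is full or EOF; blocks on interactive input. */
    static int readUntilFull(Reader in, char[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n == -1) break;
            total += n;
        }
        return total;
    }

    /** Returns as soon as at least one character arrived; -1 at EOF. */
    static int readAtLeastOne(Reader in, char[] buf, int off, int len) throws IOException {
        if (len <= 0) return 0;
        int n = in.read(buf, off, len);
        while (n == 0) {                               // retry only while nothing was read
            n = in.read(buf, off, len);
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[64];
        int full = readUntilFull(new ChunkReader("list sessions;", "quit;"), buf, 0, buf.length);
        int one = readAtLeastOne(new ChunkReader("list sessions;", "quit;"), buf, 0, buf.length);
        System.out.println(full + " vs " + one);       // prints 19 vs 14
    }
}
```

The second method returns after the first command (14 characters), so a scanner can answer it before waiting for more input.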

    Since I assume there was a good reason for this piece of code, I'd like to have a command line option to eliminate the while loop here. I'd be delighted if someone would explain to me why a repeated call of zzRefill() is worse than the while loop. It can't really be a performance issue. The cost of an I/O operation far exceeds the cost of a function call and a bit of basic arithmetic.

    Regards,

    Ronald

    bug 
    opened by lsf37 24
  • %unicode 2.0 lexers throw IOOBE on input with surrogate chars

    The default %unicode setting means a 10 times larger static memory footprint (see zzUnpackCMap). Switching to %unicode 2.0 leaves the generated Character.codePointAt() calls intact, leading to:

    java.lang.ArrayIndexOutOfBoundsException: 65536

    in the advance() method.
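The failure mode can be reproduced in isolation. The sketch below assumes a character map with 0x10000 entries, which is a guess consistent with the reported index 65536 and not taken from the generated code; the class and method names are hypothetical. Character.codePointAt() combines a surrogate pair into a code point above 0xFFFF, which then falls outside such a table.

```java
public class SurrogateIndexDemo {
    // Assumption: a table that covers only the 16-bit range U+0000..U+FFFF.
    static final int TABLE_SIZE = 0x10000;
    static final char[] CMAP = new char[TABLE_SIZE];

    /** Indexes the table by code point without any range check. */
    static char unguardedLookup(String input, int index) {
        return CMAP[Character.codePointAt(input, index)];
    }

    public static void main(String[] args) {
        String input = "\uD800\uDC00";                   // surrogate pair encoding U+10000
        int codePoint = Character.codePointAt(input, 0);
        System.out.println(codePoint);                   // prints 65536
        try {
            unguardedLookup(input, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("code point " + codePoint + " is outside the table");
        }
    }
}
```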

    bug enhancement 
    opened by gregsh 18
  • [Bug] Lexers don't work anymore after migration from JFlex 1.4 to JFlex >= 1.5 (infinite loops) [sf#134]

    Reported by thierryblind on 2015-01-13 23:42 UTC

    Hello,

    I'm a committer for the Eclipse PDT project (https://eclipse.org/pdt/) and I'm currently trying to migrate all the lexers (using JFlex 1.4) to JFlex >= 1.5. Sadly, I'm now facing infinite loops because some lexers don't seem to behave the same anymore. I attached a test project that contains some of the problematic lexers (in folder "parserTools"), the generated JFlex 1.4.3 and 1.5.1 classes, and a test case (src/launch/Tests.java). Could you please have a look and tell me what's happening?

    Thank you very much for your help,

    Thierry.

    bug 
    opened by lsf37 15
  • Generate UnicodeProperties with Bazel

    The jflex-unicode-maven-plugin is slow and fetches data from unicode.org which is also not very reliable. As a result, the generated code has been checked in. This is bad practice, and we have been modifying these generated files.

    This is the first step in an attempt to replace the jflex-unicode-maven-plugin by Bazel:

    1. Bazel fetches and caches all resources. It can use mirrors. See #522
    2. With this change, only UnicodeProperties is re-generated, using versions given on the command line.
    3. Instead of using a custom "skeleton", I'm using Apache Velocity template engine. See #520
    4. I've rewritten more than I wanted because the URL was too much part of the previous model. See DataFileType.java

    Effective changes in generated file:

    • Set default version to 9.0.0 instead of 9.0. This should be a no-op.
    • Use switch/case rather than chain of ifs
    • Bump unicode 3.1.0 to 3.1.1
    opened by regisd 15
  • [Bug] unexpected Error: could not match input [sf#107]

    Reported by kneunert on 2010-03-01 20:56 UTC I'm using a bit of a trick in the Lexer like this:

    <YYINITIAL> {identifier}[ { yybegin(ARRAY1); return someSymbol; }
    <ARRAY1> $? { yybegin(ARRAY2); return someSymbol; }
    <ARRAY2> $? { yybegin(ARRAY3); return someSymbol; }
    <ARRAY3> $? { yybegin(YYINITIAL); return someSymbol; }
    <YYINITIAL> ] { yybegin(YYINITIAL); return someSymbol; }

    The trick here is that I use an unconventional optional character. This character is not there, so no character gets consumed, yet a series of symbols is returned. This used to work in JLex and it does not seem to work in JFlex anymore. I get this:

    Symbol: [
    Exception in thread "main" java.lang.Error: Error: could not match input
        at struktor.processor.Yylex.zzScanError(Yylex.java:439)
        at struktor.processor.Yylex.next_token(Yylex.java:590)
        at struktor.processor.MyMain.main(MyMain.java:19)

    I have a simple test case for this. If needed, I can attach it to this ticket.

    Thanks

    Kim ( https://sourceforge.net/projects/struktor/ )

    bug 
    opened by lsf37 15
  • [Bug] buffer expansion bug in yy_refill()? [sf#60]

    Reported by smagoun on 2004-01-30 20:35 UTC yy_refill() in skeleton.default and skeleton.nested seems to have a problem expanding the buffer correctly. The bug manifests itself when reading a lot of data at once. I ran into this using the Piccolo XML parser, which uses JFlex to parse XML. Piccolo died while reading a very long CDATA element in the XML. I tracked it to yy_refill(), which seems to have been copied from one of the skeleton files JFlex ships with.

    The problem is that the buffer never expands properly when reading long input, which results in an ArrayIndexOutOfBoundsException. The following patch fixes Piccolo; I'm not sure if it applies to JFlex, but I'm guessing it might.

    (I'm not convinced that the if() should check yy_currentPos>=buffer.length at all, but it seems harmless)

    --- PiccoloLexer.java   Sun Jul  7 14:21:18 2002
    +++ PiccoloLexer copy.java      Fri Jan 30 15:07:44 2004
    @@ -3291,9 +3291,10 @@
    }
    
    /* is the buffer big enough? */
    -    if (yy_currentPos >= yy_buffer.length) {
    +    if (yy_currentPos >= yy_buffer.length
    +        || yy_markedPos >= yy_buffer.length) {
    /* if not: blow it up */
    -      char newBuffer[] = new char[yy_currentPos*2];
    +      char newBuffer[] = new char[yy_buffer.length*2];
    System.arraycopy(yy_buffer, 0, newBuffer, 0, 
    yy_buffer.length);
    yy_buffer = newBuffer;
    }
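As a standalone illustration of the growth policy the patch aims for (size the new buffer from the current buffer length, not from a position), here is a minimal sketch. It is a variation, not the patch itself: it doubles repeatedly until the requested index fits, ignores integer overflow for very large buffers, and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class BufferGrowthSketch {
    /**
     * Returns a buffer in which index `needed` is valid, doubling the
     * length as often as required and preserving the existing content.
     */
    static char[] ensureCapacity(char[] buffer, int needed) {
        int newLength = Math.max(buffer.length, 1);
        while (needed >= newLength) {
            newLength *= 2;
        }
        // Reuse the old array when it is already large enough.
        return newLength == buffer.length ? buffer : Arrays.copyOf(buffer, newLength);
    }

    public static void main(String[] args) {
        char[] buffer = {'a', 'b', 'c', 'd'};
        char[] grown = ensureCapacity(buffer, 20);
        System.out.println(grown.length);   // prints 32
    }
}
```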
    
    bug 
    opened by lsf37 15
  • %eof{ ... %eof} is not being included when trying to upgrade to v1.8.1 from 1.7.0

    Hello,

    After trying an upgrade of OpenGrok to JFlex v1.8.1, I found that code specified in %eof{ is not included anymore in the generated lexers. I browsed through the JFlex manual, but I didn't see anything to indicate the handling would have changed.

    (I did check but didn't see any other %eof open issues here.)

    Any advice, please?

    Thank you.

    bug 
    opened by idodeclare 14
  • error: orphaned default and error: 'else' without 'if'

    Odd generated output with error: orphaned default and error: 'else' without 'if'.

    • qdox 1.12.1 builds with 1.4.3, but not with 1.6.1 bootstrapped
    • qdox 2 builds with 1.4.3 and 1.6.1 bootstrapped
    • jflex 1.6.1 builds with 1.6.1 bootstrapped, but not with 1.4.3

    The errors are the same when jflex 1.6.1 bootstrapped fails on qdox 1.12.1, and when jflex 1.4.3 fails on jflex 1.6.1. That is odd and I cannot explain it.

    qdox 1.12.1 under 1.6.1 bootstrapped via binary fails

     * Compiling ...
    src/java/JFlexLexer.java:1357: error: orphaned default
            default:
            ^
    src/java/JFlexLexer.java:2051: error: 'else' without 'if'
              else {
              ^
    2 errors
     * ERROR: dev-java/qdox-1.12.1-r10::os-xtoo failed (compile phase):
    

    jflex 1.6.1 under 1.4.3 fails, under 1.6.1 bootstrapped via binary it builds fine

    Writing code to "java/LexScan.java"
     * Compiling ...
    src/main/java/LexScan.java:3697: error: orphaned default
              default:
              ^
    src/main/java/LexScan.java:3626: error: 'else' without 'if'
          else {
          ^
    2 errors
    

    I have 1.4.3 built and running under Java 9. I do not believe the jdk version has anything to do with the generated output with errors. I think that has more to do with syntax or something.

    invalid 
    opened by wltjr 14
  • Bump ant from 1.10.9 to 1.10.11

    Bumps ant from 1.10.9 to 1.10.11.

    dependencies 
    opened by dependabot[bot] 0
  • Bump commons-io from 2.8.0 to 2.11.0

    Bumps commons-io from 2.8.0 to 2.11.0.

    dependencies 
    opened by dependabot[bot] 0
  • Bump maven.version from 3.5.2 to 3.8.1

    Bumps maven.version from 3.5.2 to 3.8.1. Updates maven-plugin-api from 3.5.2 to 3.8.1

    Commits
    • 05c21c6 [maven-release-plugin] prepare release maven-3.8.1
    • d295dc3 [MNG-7128] keep blocked attribute from mirrors in artifact repositories
    • a469068 next version in branch 3.8.x is 3.8.1-SNAPSHOT
    • dad8a3e [maven-release-plugin] prepare for next development iteration
    • 6aa1f4a [maven-release-plugin] prepare release maven-3.8.0
    • 907d53a [MNG-7118] block HTTP repositories by default
    • 899465a [MNG-7117] add support for blocked mirror
    • fa79cb2 [MNG-7116] add support for mirrorOf external:http:*
    • e5f6634 use Maven Resolver 1.6.2
    • 09f77da [MNG-7119] Upgrade Maven Wagon to 3.4.3
    • Additional commits viewable in compare view

    Updates maven-compat from 3.5.2 to 3.8.1

    Updates maven-core from 3.5.2 to 3.8.1

    dependencies 
    opened by dependabot[bot] 0
  • Bump cup.version from 11b-20160615 to 11b-20160615-1

    Bumps cup.version from 11b-20160615 to 11b-20160615-1. Updates java-cup from 11b-20160615 to 11b-20160615-1

    Release notes

    Sourced from java-cup's releases.

    11b-20160615-1

    Bugfix: vbmacher/cup-maven-plugin#12

    Updates java-cup-runtime from 11b-20160615 to 11b-20160615-1

    dependencies 
    opened by dependabot[bot] 0
  • EOF lookahead?

    Are there any plans to do lookahead that can see EOF? There's a bug in one of Stanford NLP group's tokenization tools which would be easily fixed by such an ability.

    https://github.com/stanfordnlp/CoreNLP/issues/1161

    The fundamental issue is that our rule to tokenize "gonna" etc looks like this:

    {ASSIMILATIONS2}/[^\p{Alpha}]
    

    so basically it's taking words like "gonna" that aren't followed by more text without whitespace, such as "i'm gonnaeatallmycopiesofmoxopaloutofanger"

    Years ago I made a similar request and was told that the easiest solution would be to add an extra whitespace to the end of the text being processed, but I'm hoping something more elegant than that will be available soon. Thanks!
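The underlying difference can be shown with java.util.regex, used here only as an analogy and not as JFlex syntax. A lookahead that requires one more non-letter character cannot succeed at the end of the input, whereas a negative lookahead that merely forbids a following letter can. The class and field names are hypothetical, and \p{Alpha} in java.util.regex is the ASCII alphabetic class, which may differ from the Unicode property JFlex uses.

```java
import java.util.regex.Pattern;

public class EofLookaheadDemo {
    // Requires one non-letter character after the word, so it cannot match at end of input.
    static final Pattern NEEDS_NEXT_CHAR = Pattern.compile("gonna(?=[^\\p{Alpha}])");

    // Only forbids a letter after the word, so end of input is accepted as well.
    static final Pattern NO_LETTER_NEXT = Pattern.compile("gonna(?!\\p{Alpha})");

    static boolean found(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found(NEEDS_NEXT_CHAR, "i'm gonna"));   // prints false
        System.out.println(found(NO_LETTER_NEXT, "i'm gonna"));    // prints true
        System.out.println(found(NO_LETTER_NEXT, "gonnaeat"));     // prints false
    }
}
```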

    https://sourceforge.net/p/jflex/mailman/jflex-users/thread/CAHaU7mb%3DV98ApE80B%2BGyBUf8%2BsPg4KOenqY3O%3DXbkvONpH2utA%40mail.gmail.com/#msg32027415

    opened by AngledLuffa 0
  • Disable fmt-maven-plugin plugin Below Java 11

    Semmle LGTM analysis is currently broken with

    [autobuild] java.lang.UnsupportedClassVersionError: com/google/googlejavaformat/java/FormatterException has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 at java.lang.ClassLoader.defineClass1 (Native Method)

    Probably following #909

    This PR enables the profile on Java 11+, as required by the plugin.

    opened by regisd 0
  • Unexpected exception encountered in JFlex: Not normalised type = STAR

    I received this exception while trying to run JFlex to create a Scheme lexical analyzer for my Programming Languages course. I have never done this before, and I had never used JFlex before so this is probably a mistake I made. I am still following the instructions to try to get some help. Any help would be highly appreciated. Thanks in advance!

    "Reading "Scheme.jflex"

    Unexpected exception encountered. This indicates a bug in JFlex. Please consider filing an issue at http://github.com/jflex-de/jflex/issues/new

    Not normalised type = STAR content : type = PRIMCLASS content : { [' ']['"']['('-')']['']['^']['|'] }
    jflex.exceptions.CharClassException: Not normalised type = STAR content : type = PRIMCLASS content : { [' ']['"']['('-')']['']['^']['|'] }
        at jflex.core.RegExp.checkPrimClass(RegExp.java:242)
        at jflex.core.RegExp.normalise(RegExp.java:323)
        at jflex.core.RegExp.normalise(RegExp.java:353)
        at jflex.core.RegExps.normalise(RegExps.java:293)
        at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action_part00000000(LexParse.java:1029)
        at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action(LexParse.java:2257)
        at jflex.core.LexParse.do_action(LexParse.java:598)
        at java_cup.runtime.lr_parser.parse(lr_parser.java:699)
        at jflex.generator.LexGenerator.generate(LexGenerator.java:74)
        at jflex.Main.generate(Main.java:320)
        at jflex.Main.main(Main.java:336)"

    opened by AmyD97 8
  • Make quiet logging actually quiet

    Currently, jflex has options

    • --verbose which enables Out.println
    • --quiet which negates --verbose

    Options.verbose and Options.quiet are always the opposite of each other, except in the JflexTestRunner.

    This is very confusing and this PR refactors Options and logging:

    • Replace the Options.verbose and Options.quiet by Options.logLevel
    • To match with current behaviour
      • the log level is WARNING by default
      • --verbose increases verbosity to Level.INFO
    • Contrary to current behaviour --quiet will reduce verbosity to Level.SEVERE.
      • as needed by the JflexTestRunner
      • as wished by Bazel rule
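The proposed mapping from flags to a single log level can be sketched as follows. This is a hypothetical illustration using java.util.logging.Level, not the code of the PR: the class and method names, and the rule that the last flag wins, are assumptions.

```java
import java.util.logging.Level;

public class LogOptionsSketch {
    /** Derives one log level from the command-line flags; the last flag wins. */
    static Level logLevelFor(String... args) {
        Level level = Level.WARNING;           // default, matching current behaviour
        for (String arg : args) {
            if (arg.equals("--verbose")) {
                level = Level.INFO;            // more output
            } else if (arg.equals("--quiet")) {
                level = Level.SEVERE;          // errors only
            }
        }
        return level;
    }

    /** A message is shown when it is at least as severe as the configured level. */
    static boolean shouldLog(Level configured, Level message) {
        return message.intValue() >= configured.intValue();
    }

    public static void main(String[] args) {
        Level quiet = logLevelFor("--quiet");
        System.out.println(shouldLog(quiet, Level.WARNING));   // prints false
        System.out.println(shouldLog(quiet, Level.SEVERE));    // prints true
    }
}
```

A single level removes the possibility of verbose and quiet contradicting each other, which is the confusion described above.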
    code quality 
    opened by regisd 0
  • Run google-java-format by a github action.

    code quality 
    opened by regisd 0
  • Why does not Unicode_3_2 have propertyValueAliases for key blk=alphabeticpresentationforms?

    While re-generating the Unicode data with the ucd_generator (#828), I had an unexpected output with Unicode 3.2.0

    • propertyValueAliases: Not true that <{ahex=asciihexdigit, alpha=alphabetic, arab=arabic, etc.
    • is equal to <{ahex=asciihexdigit, alpha=alphabetic, arab=arabic, etc.
    • The subject has the following extra entries: {blk=alphabeticpresentationforms=block=alphabeticpresentationforms, blk=arabic=block=arabic, etc.

    @sarowe Why does the current Unicode_3_2.java not contain the block blk value aliases?

    The value block=alphabeticpresentationforms is used in the interval. L3928

    And the alias is defined in PropertyAliases-3.2.0.txt

    # Non-enumerated Properties
    # ================================================
    age       ; Age
    blk       ; Block
    

    Shouldn't we emit the alias in that case?

    question 
    opened by regisd 0
Releases (v1.8.2)
  • v1.8.2 (May 3, 2020)

  • v1.8.1 (Feb 28, 2020)

    JFlex 1.8.1 is a small maintenance release. There are no new features or bug fixes. The only change is

    • in dependency management for the CUP parser generator and runtime to re-enable building from source in the release package (#734)

    More detailed list of changes in milestone 1.8.1

  • v1.8.0 (Feb 26, 2020)

    • yychar type has been changed from int to long in order to support large files (> 2GB) (#605)
    • Add @SuppressWarnings("FallThrough") on generated lexer #454
    • Defend against spoon-feeding readers not fully populating the scanning buffer #543
    • Add support for Unicode 10.0 #540 11.0 #555 12.0 #556 and 12.1 #563
    • Unicode Emoji properties are supported for Unicode versions 8.0+ (#546)
    • Significantly decreased memory usage for unicode scanners from ~4MB to typical ~20kB. (#697)
    • Macro expressions in character classes are now allowed (#216, #654)
    • Expose yyatEOF() in generated scanner API (#644)
    • Pipe action | now works for <<EOF>> (#201)
    • Explicitly use UTF-8 encoding for skeleton files and dot files (#470)
    • Maven plugin now correctly checks #include file time stamp (#694)
    • Slightly optimised character classes when ^ operator is used (#682)
    • Normalised character class order. This has no influence on how text is matched, but makes --dump output more comparable. (#650)
    • Fixed a bug in the negation ! operator that in rare circumstances would fail to match everything covered by the negation (#567).
    • The . expression now does not match unpaired surrogates, since these are not characters. (#544)
    • Example specs now include builds for Ant, Make, and Maven
    • Introduced a code LexGenerator API. #428 #448
    • Add the jflex source in generated code #371 #399
    • Code cleanup
      • modularisation effort
      • Removed dead code class CharSet #480
      • Use @AutoValue #505
      • Fixed PMD violations #413 #418
      • Use Truth in tests #365 #660
      • Replace commons-io by guava #319
    • Dep updates
      • Updated maven dependencies #409
      • Updated the Maven wrapper to 0.4.2 #382
    • Build system
      • retired ant build #432
      • now supporting Bazel build

    See all changes in milestone 1.8.0
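
The yychar change from int to long noted above can be illustrated with a minimal sketch. This is not JFlex's generated code; the class and method names are hypothetical and only show why an int character offset stops working once the input grows past 2 GB.

```java
// Hypothetical demo, not part of JFlex: compares an int and a long offset counter.
final class OffsetOverflowDemo {
  /** Advances an int offset by the match length; wraps around on overflow. */
  static int advanceInt(int offset, int matchLength) {
    return offset + matchLength;
  }

  /** Advances a long offset by the match length; no wrap-around at 2^31. */
  static long advanceLong(long offset, int matchLength) {
    return offset + matchLength;
  }

  public static void main(String[] args) {
    int intOffset = Integer.MAX_VALUE - 1;     // just below the 2^31 - 1 limit
    long longOffset = Integer.MAX_VALUE - 1L;  // same position, stored as long

    // The int counter wraps to a negative value, the long counter keeps counting.
    System.out.println("int  offset after match: " + advanceInt(intOffset, 10));
    System.out.println("long offset after match: " + advanceLong(longOffset, 10));
  }
}
```

Code that stores the value of yychar in an int field will probably need a matching type change or an explicit cast after upgrading.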

    jflex-1.8.0.tar.gz(4.69 MB)
    jflex-1.8.0.tar.gz.asc(833 bytes)
    jflex-1.8.0.tar.gz.sha1(61 bytes)
    jflex-1.8.0.zip(4.81 MB)
    jflex-1.8.0.zip.asc(833 bytes)
    jflex-1.8.0.zip.zip.sha1(58 bytes)
    manual.pdf(439.19 KB)
  • untagged-7b543e39f8064f99874f(Oct 10, 2018)

  • v1.7.0(Sep 21, 2018)

    • Prerequisites
      • Compilation requires jdk7 and Maven 3.5.2
      • Execution requires jdk7 and Maven 3.0
      • Compilation of generated code requires jdk 5
    • CUP upgraded to 0.11b
    • Option --inputstreamctor has been removed (#195)
    • Code health
      • Codebase has valid doclint (#206)
      • Maven plugins update to use Java annotations rather than javadoc at-clauses.
    • jflex --version or --info or --help now exits with error code 0 (#194)
    • Unicode 8.0 and 9.0 are supported (#209)
    • documentation improvements (#152, #187, #215, #290)
    • added an --encoding option to specify input/output encoding (#164)
    • make jflex start script robust for other locales (#251)
    • report character position when %debug and %char are present (#207)

    See https://github.com/jflex-de/jflex/milestone/10

    jflex-1.7.0.tar.gz(3.48 MB)
    jflex-1.7.0.tar.gz.asc(833 bytes)
    jflex-1.7.0.tar.gz.sha1(61 bytes)
    jflex-1.7.0.zip(3.55 MB)
    jflex-1.7.0.zip.asc(833 bytes)
    jflex-1.7.0.zip.sha1(58 bytes)
    jflex-full-1.7.0.jar(1.23 MB)
    jflex-full-1.7.0.jar.asc(854 bytes)
    manual.pdf(455.54 KB)
  • release_1_4(Nov 7, 2017)

    Released 2004-04-12

    • new, very fast minimization algorithm (also fixes memory issues)
    • new --jlex option for strict compatibility with JLex. Currently it changes %ignorecase to JLex semantics, that is, character classes are interpreted in a caseless way, too (fixes bug #59, %ignorecase ignored by char classes). Thanks to Edward D. Willink for spotting the incompatibility.
    • support for even larger scanners (up to 64K DFA states). Thanks to Karin Vespoor.
    • removed eclipse compiler warnings for generated classes (feature request #144)
    • implemented faster character classes (feature request #143). Expressions like [a-z] | [A-Z] are interpreted as one atomic class [a-zA-Z], reducing NFA states and generation time significantly for some specifications. This affects the generation process only; generated scanners remain the same.
    • new %apiprivate switch (feature request #141/1) that causes all generated and skeleton methods to be made private. Exceptions to this are user defined functions and the constructor. Thanks to Stephen Ostermiller for the suggestion.
    • allow user defined javadoc class comments (feature request #141/2). If the user code section ends with a javadoc comment, JFlex takes this instead of the generated comment. Thanks to Stephen Ostermiller for the suggestion.
    • fixed bug #50 (undefined macros in complement expressions do not throw exception in generator). Thanks to Stephen Ostermiller for the bug report.
    • fixed bug #51 (yypushStream/yypopStream in skeleton.nested work as advertised)
    • fixed bug #57 (no wrong macro warnings on regexp negation)
    • fixed bug #58 (%cupsym now also affects %cupdebug). Thanks to Eric Schweitz for the fix.
    • fixed bug #52 (single-line %initthrow works now in case of extra whitespace before newline)
    • yyreset() no longer closes the associated reader (use yyclose() explicitly for that). Makes some reader objects reusable (feature request #140). Thanks to Stephen Ostermiller for the suggestion.
    • fixed modifier order in generated code, removes jikes compiler warnings. Thanks to Michael Wildpaner for the fix.
    • ant task now also works with ant >= 1.4 (fixes bug #54)
    • yyreset() no longer declares an exception (fixes bug #65)
    • %cup does not include %eofclose in JLex mode (--jlex). (Fixes bug #63)
    • optional parameter to %eofclose: "%eofclose false" turns off %eofclose if it was turned on previously (e.g. by %cup). (Fixes bug #63)
    • jflex build script switched to ant
    • internal: central Options class for better integration with build tools and IDEs
    • internal: change naming scheme for generated internal variables from yy_ to zz to comply with Java naming standard. Thanks to Max Gilead for the patch.
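
The idea behind the faster character classes item above (treating [a-z] | [A-Z] as one class) can be sketched as a union of character intervals. This is an illustration only, not JFlex's implementation; the class CharClassMergeDemo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo, not part of JFlex: merges character ranges into one class.
final class CharClassMergeDemo {
  /** Merges inclusive [start, end] intervals into a sorted, non-overlapping list. */
  static List<int[]> merge(List<int[]> intervals) {
    List<int[]> sorted = new ArrayList<>(intervals);
    sorted.sort((a, b) -> Integer.compare(a[0], b[0]));
    List<int[]> out = new ArrayList<>();
    for (int[] iv : sorted) {
      int[] last = out.isEmpty() ? null : out.get(out.size() - 1);
      if (last != null && iv[0] <= last[1] + 1) {
        // Overlapping or directly adjacent: extend the previous interval.
        last[1] = Math.max(last[1], iv[1]);
      } else {
        // Copy so the caller's arrays are never modified.
        out.add(new int[] {iv[0], iv[1]});
      }
    }
    return out;
  }

  /** Renders merged intervals in character class notation, sorted by code. */
  static String toClass(List<int[]> intervals) {
    StringBuilder sb = new StringBuilder("[");
    for (int[] iv : merge(intervals)) {
      sb.append((char) iv[0]);
      if (iv[1] > iv[0]) {
        sb.append('-').append((char) iv[1]);
      }
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    // [a-z] | [A-Z] as two separate ranges, rendered as a single class.
    List<int[]> ranges = Arrays.asList(new int[] {'a', 'z'}, new int[] {'A', 'Z'});
    System.out.println(toClass(ranges));
  }
}
```

The sketch prints the ranges in code order, so the class appears as [A-Za-z], which denotes the same set as [a-zA-Z].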
  • release_1_4_1(Nov 7, 2017)

    Released 2004-11-07

    • merged in patch by Don Brown (fixes #70 Uses Old JUnit method assertFalse)
    • merged in patch by Don Brown (fixes #62 buffer expansion bug in yy_refill()). Thanks to Binesh Bannerjee for providing a simpler test case for this problem.
    • fixed bug #69 (ArrayIndexOutOfBounds in IntCharSet)
    • fixed bug #68 (Cannot use lookahead with ignorecase)
    • converted dangerous lookahead error to warning
    • print info for EOF actions as well in %debug mode
    • fixed line number count for EOF actions
    • internal: removed unused methods in LexScan.flex and IntCharSet
  • release_1_4_2(Nov 7, 2017)

    Released 2008-05-28

    • implemented feature request #75: Now supports generics syntax for %type, %extends, etc
    • implemented feature request #156: Provided %ctorarg option to add arguments to constructor
    • fixed bug #80 (Reader.read might return 0)
    • fixed bug #57 (Ambiguous error message in macro expansion)
    • fixed bug #89 (Syntax error in input may cause NullPointerException)
    • fixed bug #85 (Need to defend against path blanks in jflex bash script)
    • fixed bug #82 (EOF actions may be ignored for same lex state)
    • fixed bug #81 (syntax error in generated ZZ_CMAP)
    • fixed bug #77 (lookahead and "|" actions)
    • fixed bug #74 (yytext() longer than expected with lookahead)
    • fixed bug #73 (OS/2 Java 1.1.8 Issues)
    • fixed bug #40 (dangerous lookahead check may fail)
  • v1.5.0(Nov 29, 2019)

    Released 2014-03-23

    • the "switch" and "table" code generation options are deprecated and will be removed in JFlex 1.6
    • the JFlex license has been changed from GPL to BSD.
    • updated JFlex to CUP version 0.11a.
    • changed the build from Ant to Maven. 523d7a9
    • JFlex now mostly conforms with Unicode Regular Expressions UTS#18 Basic Unicode Support - Level 1. Supplementary code points (above the Basic Multilingual Plane) are not yet supported.
    • new meta characters supported: \s, \S, \d, \D, \w, \W.
    • nested character sets now supported, e.g. [[[ABC]D]E[FG]]
    • new character set operations supported: union (e.g. [A||B]), intersection (e.g. [A&&B]), set-difference (e.g. [A--B]), and symmetric difference (e.g. [A~~B]).
    • the meaning of the dot (".") meta character has been changed from [^\n] to [^\n\r\u000B\u000C\u0085\u2028\u2029]. Use the new --legacydot option to cause "." to be interpreted as [^\n].
    • new \R meta character matches any newline: "\r\n" | [\n\r\u000B\u000C\u0085\u2028\u2029].
    • new option --noinputstreamctor to not include an InputStream constructor in the generated scanner.
    • %include can now be used in the rules section (#117)
    • yychar and zzAtBOL should be reset for nested input streams (#107 & #108)
    • fixed bug #109 (could not match input for empty string matches.)
    • fixed bug #112 & #119 (properly update zzFin when reallocating zzBuffer)
    • fixed bug #115 (noncompileable scanner generation when default locale is Turkish)
    • fixed bug #114 (zzEOFDone not included with pushed nested stream state)
    • fixed bug #105 (can't build examples/java/)
    • fixed bug #106 (impossible char class range should trigger syntax error)
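
The four character set operations listed above (union, intersection, set-difference, symmetric difference) follow ordinary set semantics. The sketch below shows those semantics with java.util.BitSet. It does not use or reproduce JFlex internals, and the class CharSetOpsDemo is hypothetical.

```java
import java.util.BitSet;

// Hypothetical demo, not part of JFlex: set semantics of [A||B], [A&&B], [A--B], [A~~B].
final class CharSetOpsDemo {
  /** Builds a set containing every code point of the given string. */
  static BitSet of(String chars) {
    BitSet set = new BitSet();
    chars.codePoints().forEach(set::set);
    return set;
  }

  /** [A||B]: members of a or b. */
  static BitSet union(BitSet a, BitSet b) {
    BitSet r = (BitSet) a.clone();
    r.or(b);
    return r;
  }

  /** [A&&B]: members of both a and b. */
  static BitSet intersection(BitSet a, BitSet b) {
    BitSet r = (BitSet) a.clone();
    r.and(b);
    return r;
  }

  /** [A--B]: members of a that are not in b. */
  static BitSet difference(BitSet a, BitSet b) {
    BitSet r = (BitSet) a.clone();
    r.andNot(b);
    return r;
  }

  /** [A~~B]: members of exactly one of a and b. */
  static BitSet symmetricDifference(BitSet a, BitSet b) {
    BitSet r = (BitSet) a.clone();
    r.xor(b);
    return r;
  }

  /** Lists the members of a set in ascending code point order. */
  static String show(BitSet set) {
    StringBuilder sb = new StringBuilder();
    set.stream().forEach(sb::appendCodePoint);
    return sb.toString();
  }

  public static void main(String[] args) {
    BitSet a = of("ABC");
    BitSet b = of("BCD");
    System.out.println("union:                " + show(union(a, b)));
    System.out.println("intersection:         " + show(intersection(a, b)));
    System.out.println("difference:           " + show(difference(a, b)));
    System.out.println("symmetric difference: " + show(symmetricDifference(a, b)));
  }
}
```

With A = {A, B, C} and B = {B, C, D}, the results are ABCD, BC, A, and AD respectively.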
  • release_1_4_3(Nov 7, 2017)

    Released 2009-01-31

    • fixed bug #100 (lookahead syntax error)
    • fixed bug #97 (min_int in Java example scanner)
    • fixed bug #96 (zzEOFDone not reset in yyreset(Reader))
    • fixed bug #95 (%type and %int at the same time should produce error msg)
  • release_1_5_1(Nov 7, 2017)

    • fixed problem calling ./jflex start scripts (#127)
    • corrected documentation flaws (#126)
    • further documentation and website updates
    • JFlex now reports the correct version string
    • added support for CUP2 with %cup2 switch, based on patch by Andreas Wenger
  • v1.6.0(Nov 9, 2017)

    Released 2014-06-21

    • Unicode 7.0 is supported.
    • In %unicode mode, supplementary code points are now handled properly.
      • Regular expressions are now code-point based, rather than code-unit/char based.
      • Input streams are read as code point sequences - properly paired surrogate code units are read as a single character.
      • All supported Unicode properties now match supplementary characters when Unicode 3.0 or above is specified, or when no version is specified, causing the default Unicode version, Unicode 7.0 in this release, to be used.
    • New \u{...} escape sequence allows code points (and whitespace-separated sequences of code points) to be specified as 1-6 hexadecimal digit values.
    • Characters in matches printed in %debug mode are now Unicode escaped (\uXXXX) when they are outside the range 32..127.
    • detect javadoc class comment when followed by annotation(s) (#128)
    • removed the "switch" and "table" code generation options
    • Option --noinputstreamctor deprecated. By default no InputStream constructor is included in the generated scanner. The capability to include one is deprecated and will be removed in JFlex 1.7.
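
The difference between code units and code points mentioned above can be seen directly in the Java standard library. The sketch below is a plain Java illustration and does not call any JFlex API; the class CodePointDemo is hypothetical.

```java
// Hypothetical demo, not part of JFlex: code units versus code points in Java strings.
final class CodePointDemo {
  /** Number of UTF-16 code units (Java chars) in the string. */
  static int codeUnits(String s) {
    return s.length();
  }

  /** Number of Unicode code points in the string. */
  static int codePoints(String s) {
    return s.codePointCount(0, s.length());
  }

  public static void main(String[] args) {
    // U+1F600 lies above the Basic Multilingual Plane, so UTF-16 needs a surrogate pair.
    String supplementary = new String(Character.toChars(0x1F600));

    System.out.println("code units:  " + codeUnits(supplementary));  // 2
    System.out.println("code points: " + codePoints(supplementary)); // 1
    System.out.println("high surrogate first: "
        + Character.isHighSurrogate(supplementary.charAt(0)));       // true
  }
}
```

A char-based matcher sees this input as two separate units, while a code-point based matcher treats the properly paired surrogates as a single character, as the release notes describe.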
  • v1.6.1(Nov 9, 2017)

    Released 2015-03-16

    1.6.1 is a maintenance release, fixing all known defects.

    Changelog:

    • JFlex development, wiki, and issue tracker moved to https://github.com/jflex-de/
    • Fixed issue #130, "in caseless mode, chars in regexps not accepted caselessly": Caseless option works again as intended.
    • Fixed issue #131, "re-enable scanning interactively or from a network byte stream": JFlex now throws an IOException when a Reader returns 0 characters.
    • New example, shows how to deal with Readers that return 0 characters.
    • Command line scripts work again in repository version (contributed by Emma Strubell)
    • New options --warn-unused and --no-warn-unused that control warnings about unused macros.
    • Fixed issue #125: %apiprivate and %cup2 switches are no longer incompatible
    • Fixed issue #133, "Error in skeleton.nested": Empty-string matches were taking precedence over EOF and caused non-termination. Now EOF is counted as the highest-priority empty match.
    • New warning when an expression matches the empty string (can lead to non-termination).
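
The behaviour described above for Readers that return 0 characters can be sketched in a few lines. This is an assumption-laden illustration, not the code JFlex generates: the helper readOrFail and the StalledReader class are hypothetical and only show the idea of failing fast instead of looping forever.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo, not part of JFlex: rejecting a Reader that returns 0 characters.
final class ZeroReadDemo {
  /**
   * Reads into buf and returns the number of characters read, or -1 at end of input.
   * Throws an IOException if the reader reports 0 characters for a non-empty request,
   * because a refill loop could otherwise spin without making progress.
   */
  static int readOrFail(Reader in, char[] buf, int off, int len) throws IOException {
    int n = in.read(buf, off, len);
    if (n == 0 && len > 0) {
      throw new IOException("Reader returned 0 characters");
    }
    return n;
  }

  /** A misbehaving reader that always claims to have read nothing. */
  static final class StalledReader extends Reader {
    @Override
    public int read(char[] cbuf, int off, int len) {
      return 0;
    }

    @Override
    public void close() {}
  }

  public static void main(String[] args) throws IOException {
    char[] buf = new char[4];
    System.out.println("well-behaved reader: " + readOrFail(new StringReader("ab"), buf, 0, 4));
    try {
      readOrFail(new StalledReader(), buf, 0, 4);
    } catch (IOException e) {
      System.out.println("stalled reader: " + e.getMessage());
    }
  }
}
```

An alternative design would retry the read a bounded number of times before giving up, which may suit interactive or network input better.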
    jflex-1.6.1.tar.gz(2.88 MB)
    jflex-1.6.1.zip(2.96 MB)
    jflex-maven-plugin-1.6.1.tar.gz(13.92 KB)
    jflex-maven-plugin-1.6.1.zip(31.51 KB)