ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Overview

ANTLR v4

Java 7+ License

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface (or visitor) that makes it easy to respond to the recognition of phrases of interest.
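To make the listener idea concrete, here is a toy sketch in plain Python. It does not use the ANTLR runtime: `Node`, `Listener`, `NumberCollector`, and `walk` are hypothetical names chosen for illustration, and the interfaces ANTLR actually generates differ per grammar and target.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy parse-tree node: a rule or token name, children, and token text."""
    rule: str
    children: list = field(default_factory=list)
    text: str = ""


class Listener:
    """Hypothetical listener with enter/exit callbacks fired for every node."""

    def enter(self, node):
        pass

    def exit(self, node):
        pass


class NumberCollector(Listener):
    """Responds only to the phrase of interest: NUMBER leaves."""

    def __init__(self):
        self.numbers = []

    def enter(self, node):
        if node.rule == "NUMBER":
            self.numbers.append(int(node.text))


def walk(listener, node):
    """Depth-first walk: enter fires before the children, exit after them."""
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)


# A hand-built tree standing in for what a parser would produce for "1+1".
tree = Node("operation", [
    Node("NUMBER", text="1"),
    Node("'+'", text="+"),
    Node("NUMBER", text="1"),
])
collector = NumberCollector()
walk(collector, tree)
```

The application code lives entirely in the listener subclass, so the walker and the tree structure can stay generic.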

Given day-job constraints, my time working on this project is limited so I'll have to focus first on fixing bugs rather than changing/improving the feature set. Likely I'll do it in bursts every few months. Please do not be offended if your bug or pull request does not yield a response! --parrt

Donate

Authors and major contributors

Useful information

You might also find the following pages useful, particularly if you want to mess around with the various target languages.

The Definitive ANTLR 4 Reference

Programmers run into parsing problems all the time. Whether it’s a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language—ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features.

You can buy the book The Definitive ANTLR 4 Reference at Amazon or an electronic version at the publisher's site.

You will find the Book source code useful.

Additional grammars

The grammars-v4 repository is a collection of grammars without actions, where each root directory name is the all-lowercase name of the language parsed by the grammar, for example java, cpp, csharp, or c.

Issues
  • New extended Unicode escape \u{10ABCD} to support Unicode literals > U+FFFF

    Fixes #276 .

    This used to be a WIP PR, but it's now ready for review.

    This PR introduces a new extended Unicode escape \u{10ABCD} in ANTLR4 grammars to support Unicode literal values > U+FFFF.
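    The shape of the escape can be illustrated with a small Python sketch. This is not the ANTLR tool's implementation: `decode_extended_escapes` is a hypothetical helper, and the limit of one to six hex digits is an assumption made for the example.

```python
import re

# Matches the extended escape form \u{...}; the 1-6 hex digit limit is an
# assumption for this sketch, not taken from the PR.
EXTENDED_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")

MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point


def decode_extended_escapes(literal):
    """Replace every \\u{X...} escape in `literal` with the character it names."""

    def replace(match):
        value = int(match.group(1), 16)
        if value > MAX_CODE_POINT:
            raise ValueError(f"code point out of range: U+{value:X}")
        return chr(value)

    return EXTENDED_ESCAPE.sub(replace, literal)


# A raw string keeps the backslash so the helper sees the escape itself.
decoded = decode_extended_escapes(r"a\u{1F600}b")
beyond_bmp = decode_extended_escapes(r"\u{10ABCD}")
```

    Both sample values lie above U+FFFF, which is the range a four-digit escape cannot express.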

    The serialized ATN represents any atom or range with a Unicode value > U+FFFF as a set. Any such set is serialized in the ATN with 32-bit arguments.

    I bumped the UUID, since this changes the serialized ATN format.

    I included lots of tests and made sure everything is passing on Linux, Mac, and Windows.

    type:feature unicode 
    opened by bhamiltoncx 115
  • splitting version numbers for targets

    Hiya: @pboyer, @mike-lischke, @janyou, @ewanmellor, @hanjoes, @ericvergnaud, @lingyv-li, @marcospassos

    Eric has raised the point that it would be nice to be able to make quick patches to the various runtimes; e.g., there is a stopping bug now in the JavaScript target. He proposes something along these lines:

    • any change in the tool or the runtime algorithm bumps the middle version #: 4.9 -> 4.10 -> 4.11
    • any bug fix in a runtime we bump the last digit of that runtime only: 4.9 -> 4.9.1 -> 4.9.2
    • if bumping the java runtime for bug fix we also bump the tool since it contains the runtime
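    The three rules above can be sketched as follows. The function names and the `versions` dictionary are hypothetical, and dropping the patch digit on a middle-number bump is an assumption; the proposal itself does not spell that out.

```python
def parse_version(version):
    """Split a dotted version string such as '4.9.2' into integers."""
    return [int(part) for part in version.split(".")]


def bump_minor(version):
    """Tool or runtime-algorithm change: 4.9 -> 4.10 (patch digit dropped)."""
    major, minor = parse_version(version)[:2]
    return f"{major}.{minor + 1}"


def bump_patch(version):
    """Runtime bug fix: 4.9 -> 4.9.1 -> 4.9.2."""
    parts = parse_version(version)
    if len(parts) == 2:
        parts.append(0)
    parts[2] += 1
    return ".".join(str(part) for part in parts)


def release_runtime_fix(versions, target):
    """Bump one runtime; a Java runtime fix also bumps the tool that bundles it."""
    updated = dict(versions)
    updated[target] = bump_patch(updated[target])
    if target == "java":
        updated["tool"] = bump_patch(updated["tool"])
    return updated


versions = {"tool": "4.9", "java": "4.9", "javascript": "4.9"}
after_js_fix = release_runtime_fix(versions, "javascript")
after_java_fix = release_runtime_fix(versions, "java")
```

    After the JavaScript fix only that runtime moves to 4.9.1, which is exactly the situation where 4.9.x stops meaning the same thing across targets.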

    This is suboptimal, as people have criticized me in the past for bumping, say, 4.6 to 4.7 for some minor changes. It also has the problem that 4.9.x may not mean the same thing in two different targets, as each target will now have its own version number.

    Rather than break up all of the targets into separate repositories or similar, can you guys think of a better solution? Any suggestions? The goal here is to allow more rapid target releases, and independent of me having to do a major release of the tool.

    type:question 
    opened by parrt 94
  • Improve memory usage and perf of CodePointCharStream: Use 8-bit, 16-bit, or 32-bit buffer

    This greatly improves the memory usage and performance of CodePointCharStream by ensuring the internal storage uses either an 8-bit buffer (for Unicode code points <= U+00FF), 16-bit buffer (for Unicode code points <= U+FFFF), or a 32-bit buffer (Unicode code points > U+FFFF).

    I split out the internal storage into a class CodePointBuffer which has a CodePointBuffer.Builder class which has the logic to upgrade from 8-bit to 16-bit to 32-bit storage.
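    The upgrade logic can be sketched in Python with the standard `array` module. This is an illustration only, not the Java code from the PR: `CodePointBufferBuilder` and `build_buffer` are hypothetical names, and the sketch models the storage tiers but not the virtual-call overhead discussed in this PR.

```python
from array import array


class CodePointBufferBuilder:
    """Hypothetical builder that starts narrow and widens storage on demand."""

    # (largest value the tier can hold, array typecode, nominal width in bits).
    # 'B', 'H', and 'L' are unsigned types of at least 8, 16, and 32 bits.
    _TIERS = [(0xFF, "B", 8), (0xFFFF, "H", 16), (0x10FFFF, "L", 32)]

    def __init__(self):
        self._tier = 0
        self._data = array(self._TIERS[0][1])

    @property
    def bits(self):
        """Nominal width of the current storage tier."""
        return self._TIERS[self._tier][2]

    def append(self, code_point):
        if not 0 <= code_point <= 0x10FFFF:
            raise ValueError(f"not a Unicode code point: {code_point}")
        # Upgrade storage until the current tier can hold the new value,
        # copying the code points collected so far into the wider array.
        while code_point > self._TIERS[self._tier][0]:
            self._tier += 1
            self._data = array(self._TIERS[self._tier][1], list(self._data))
        self._data.append(code_point)

    def build(self):
        return self._data


def build_buffer(text):
    """Feed every code point of `text` through the builder."""
    builder = CodePointBufferBuilder()
    for ch in text:
        builder.append(ord(ch))
    return builder
```

    Input that never leaves the Latin-1 range stays in the narrowest tier, which is where the memory saving comes from.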

    I found the perf hotspot in CodePointCharStream on master was the virtual method calls from CharStream.LA(offset) into IntBuffer.

    Refactoring it into CodePointBuffer didn't help (in fact, it added another virtual method call).

    To fix the perf, I made CodePointCharStream an abstract class and made three concrete subclasses: CodePoint8BitCharStream, CodePoint16BitCharStream, and CodePoint32BitCharStream which directly access the array of underlying code points in the CodePointBuffer without virtual method calls.

    lexers target:java comp:performance 
    opened by bhamiltoncx 85
  • initial discussion to start integration of new targets

    As promised, I am now ready to integrate the new ANTLR target languages you folks have been working on. This issue is meant to get everybody in sync, check status, and discuss the proper order of integration and resolve issues etc.

    There are two administrative details to get out of the way first:

    1. Please let me know if there is another github user that should be added to one of the categories. Or, of course, if you would like your user ID removed from this discussion.
    2. Nothing can be merged into antlr/antlr4 unless every single committer has added themselves to the contributors.txt file. It's onerous, particularly for simple commits, but it is a requirement for anything merged into the master. Eclipse foundation lawyers tell me that we have one of the cleanest licenses out there and it contributes to ANTLR's widespread use because companies are not afraid to use the software. See the genesis of such heinous requirements in SCO v IBM. This means lead target authors have to go back through their committers list quickly and ask them to sign the contributors file with a new commit. Or, they can remove that commit and enter their own version of the functionality, being careful not to violate copyright on the previous.

    As we proceed, please keep in mind that I have a difficult role, balancing the needs of multiple targets and keeping discussions in the civil and practical zone. Decisions I make come from the perspective of over 25 years managing and leading this project. I look forward to incorporating your hard work into the main antlr repo.

    C++ current location

    • @mike-lischke
    • @DanMcLaughlin
    • @nburles
    • @davesisson

    Go current location, previous discussion

    • @pboyer

    Swift current location: unclear, previous discussion

    • @jeffreyguenther
    • @hanjoes
    • @janyou
    • @ewanmellor

    Likely interested/supporting humans (scraped from github issues):

    • @RYDB3RG
    • @wjkohnen
    • @willfaught
    • @parrt
    • @sharwell
    • @ericvergnaud
    type:improvement target:swift target:cpp target:go 
    opened by parrt 84
  • .NET Core Support

    I added a new solution for the .NET Runtime, which supports .NET Core. Also made some code changes, to fix issues with missing APIs.

    type:improvement comp:build target:csharp 
    opened by lecode-official 80
  • PHP Target

    I would like to propose introducing a PHP target and runtime.

    Is anyone interested in joining me to implement a PHP target?

    opened by marcospassos 75
  • Swift Target

    I did a quick search and I didn't see anything written about this yet. What's the likelihood of a Swift target for ANTLR?

    There are C#, Javascript, and Python targets at the moment.

    What does it take to implement a target? Given that Swift is more Java-like, it seems like it should be possible. Maybe start with a code translator, if there is one (Java to Swift), and iterate towards a more idiomatic implementation.

    type:question 
    opened by jeffreyguenther 69
  • Add a new CharStream that converts the symbols to upper or lower case.

    This is useful for many of the case insensitive grammars found at https://github.com/antlr/grammars-v4/ which assume the input would be all upper or lower case. Related discussion can be found at https://github.com/antlr/antlr4test-maven-plugin/issues/1

    It would be used like so:

    is, _ := antlr.NewFileStream("filename")
    
    in := antlr.NewCaseChangingStream(is, true) // true forces upper case symbols, false forces lower case.
    
    lexer := parser.NewAbnfLexer(in)
    

    While writing this, I found other people have written their own similar implementations (go, java). It makes sense to place this in the core, so everyone can use it.
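    The idea behind such a stream can be sketched in a few lines of Python. This is not the Go code from the PR: the class below, its `la` and `get_text` methods, and the choice to keep the original text available are assumptions made for illustration.

```python
EOF = -1  # sentinel returned when looking past the end of the input


class CaseChangingStream:
    """Hypothetical stream that shows the lexer case-folded symbols."""

    def __init__(self, text, upper):
        self._text = text
        self._upper = upper  # True forces upper case, False forces lower case
        self._index = 0

    def consume(self):
        """Advance by one symbol unless already at the end."""
        if self._index < len(self._text):
            self._index += 1

    def la(self, offset):
        """Look ahead `offset` symbols (1-based); return the folded code point."""
        pos = self._index + offset - 1
        if offset < 1 or pos >= len(self._text):
            return EOF
        ch = self._text[pos]
        changed = ch.upper() if self._upper else ch.lower()
        # Some characters change length when case-mapped; leave those as-is.
        if len(changed) != 1:
            changed = ch
        return ord(changed)

    def get_text(self, start, stop):
        """Return the original, unmodified text for start..stop inclusive."""
        return self._text[start:stop + 1]


stream = CaseChangingStream("Select", upper=True)
seen_by_lexer = "".join(chr(stream.la(i)) for i in range(1, 7))
original = stream.get_text(0, 5)
```

    Folding only in the lookahead lets a grammar written for upper-case keywords match mixed-case input while the token text stays as the user typed it.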

    I would love for the grammar to have an option that says the lexer should upper/lower case all input, and then this code could be moved into the generated Lexer, and no user would need to explicitly use a CaseChangingStream (similar to what's discussed in #1002).

    lexers comp:runtime target:java target:javascript target:go 
    opened by bramp 69
  • [CSharp] #2021 fixes nuget packaging options to avoid missing dll exceptions

    @ericvergnaud Hi, I modified csproj options a bit, now I can get a working nuget package locally without the issue we described in #2021. I added .net 3.5 as a target to "main" csproj along with netstandard, since it's easier to keep track of requirements for both sets of api's when editing code and, ideally, both targets can be packed into a nuget package with a single command. Right now it's possible only on Windows via msbuild /t:pack or Visual Studio; unfortunately, due to https://github.com/Microsoft/msbuild/issues/1333, right now dotnet build pack does not work for .net 3.5 target the way it should, so I adjusted the existing script to create packages from .nuspec and different solutions for different targets.

    comp:build target:csharp 
    opened by listerenko 68
  • A few updates to the Unicode documentation.

    It should be made clear that the recommended use of CharStreams.fromPath() is a Java-only solution. The other targets just have their ANTLRInputStream class extended to support full Unicode.

    comp:doc 
    opened by mike-lischke 61
  • [Swift] remove usages of return where it can be omitted

    Removed usages of return where it can be omitted, i.e., in all single-expression closures, getters, and functions.

    Applied by AppCode

    Swift.stg was processed manually

    opened by martinvw 2
  • [Swift] Changes to Mutex-es to guard the staticly shared [DFA]

    Request for review: @janyou, @ewanmellor, @hanjoes, @rmehta33

    By sharing the mutex in the same way as the DFA array it becomes possible to run in parallel
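    The principle can be illustrated with a Python sketch. It is not the Swift change from this PR: `Recognizer`, its class-level cache, and `worker` are hypothetical, and the cache merely stands in for statically shared state.

```python
import threading


class Recognizer:
    """Hypothetical recognizer whose instances all share one cache."""

    # Both the data and the lock live at class level. A per-instance lock
    # would leave the shared cache unprotected across instances.
    _shared_cache = {}
    _shared_lock = threading.Lock()

    def lookup(self, key):
        """Return the cached value for `key`, creating it on first use."""
        with Recognizer._shared_lock:
            if key not in Recognizer._shared_cache:
                Recognizer._shared_cache[key] = len(Recognizer._shared_cache)
            return Recognizer._shared_cache[key]


def worker(start):
    """Each thread uses its own instance but touches the shared cache."""
    recognizer = Recognizer()
    for i in range(start, start + 100):
        recognizer.lookup(i % 50)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

    Because the lock has the same scope as the data it guards, every instance on every thread serializes on the same lock.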

    Fixes #3271

    opened by martinvw 0
  • Visual Studio 2022 Preview 3.1 crashes when using Antlr generated C# code

    Hello world,

    Before I describe the issue, I want to be clear that I think this is a problem that Microsoft should be looking at, but per their request, I'm posting the issue here as well.

    I also want to post that I do have a workaround, described at the above link as well.

    I have a repository that also illustrates the issue here.

    Basically, when using the TSQL grammar from here, if generated using Antlr 4.6.6, when using TSQLParser.cs, Visual Studio 2022 Preview 3.1 will crash if the project is targeting .NET 6.

    When I generated the same file using Antlr 4.9.2, the project does not crash.

    It's also odd to me that the problem does not manifest itself if I target .NET 5 instead of .NET 6, which is why I thought Microsoft should look at it, but well... you can see the response above.

    This seems like it's related to issue #2660, but that is closed already.

    Thanks, R.

    opened by dynamoRando 5
  • Option `encoding` only allows valid input

    Issue: https://github.com/antlr/antlr4/issues/3272

    opened by haeungun 0
  • Option `-package` allows identifier with space and other not allowed chars

    For instance, -package "invalid package" generates not compilable code.

    I suggest using the following regex: ^([a-zA-Z_][a-zA-Z\d_]*)$ for packages validation. Maybe extend this regex to allow utf8 identifiers because modern languages support such chars for identifiers.
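    A quick way to try the suggested pattern is a Python sketch like the one below. The helper names are hypothetical, `re.ASCII` is added so that `\d` only matches 0-9, and `str.isidentifier()` is used as one possible stand-in for the Unicode-aware variant mentioned above.

```python
import re

# The regex proposed in the issue, compiled with re.ASCII so \d means [0-9].
PACKAGE_PATTERN = re.compile(r"^([a-zA-Z_][a-zA-Z\d_]*)$", re.ASCII)


def is_valid_package(name):
    """Check `name` against the proposed pattern.

    fullmatch is used because `$` alone would also accept a trailing newline.
    """
    return PACKAGE_PATTERN.fullmatch(name) is not None


def is_valid_unicode_identifier(name):
    """Looser check that accepts non-ASCII identifier characters."""
    return name.isidentifier()


samples = ["my_pkg1", "invalid package", "1abc", "org.example"]
results = {name: is_valid_package(name) for name in samples}
```

    Note that the pattern as written contains no `.`, so a dotted name such as `org.example` is rejected. Whether that is intended is not stated in the issue.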

    Maybe the same for -encoding

    opened by KvanTTT 2
  • Swift Target Crashes with Multi-Threading

    • [x] I am not submitting a question on how to use ANTLR; instead, go to antlr4-discussion google group or ask at stackoverflow
    • [x] I have done a search of the existing issues to make sure I'm not sending in a duplicate

    Reproducing Error

    Running via Xcode for iOS 13. Project Swift version: 4.2. Compiled the Swift target from the 4.9.2 release.

    Grammar

    grammar math;
    
    NUMBER : [0-9] ;
    WS : [ \r\n\t] + -> skip ;
    
    operation : l=NUMBER op='+' r=NUMBER  ;
    

    Swift Code

    for _ in 0...10 {
        DispatchQueue.global().async {
            let chars = ANTLRInputStream("1+1")
            let lexer = mathLexer(chars)
            
            let tokenStream = CommonTokenStream(lexer)
            let parser = try! mathParser(tokenStream)
            
            let node = try? parser.operation()
        }
    }
    

    What Happened

    The program crashes in LexerATNSimulator.swift on line 721 with EXC_BAD_ACCESS. There was a double free at some point in the stack.

    Any help is greatly appreciated! Hopefully this is just user error.

    opened by rmehta33 4
  • JavaScript Runtime bug

    I already posted the complete issue on StackOverflow and someone pointed out that this should be posted here.

    This is the code for the Node.js file:

    import {createServer} from 'http';
    import antlr4 from 'antlr4';
    import fs from 'fs'
    import HelloLexer from "./gen/CobolLexer.js";
    import HelloParser from "./gen/CobolParser.js";
    import CustomCobolListener from "./CustomCobolListener.js"
    
    const {CommonTokenStream, InputStream} = antlr4;
    let Filename = './COBOLCodes/AROMA96.CBL'
    createServer((req, res) => {
        res.writeHead(200, {"Content-Type": "text/json"});
        var InputFromFile = '';
        try {
            let data = fs.readFileSync(Filename, 'utf8');
            InputFromFile = data;
        } catch (e) {
            console.log(e);
        }
        console.log("server running");
        try {
            var chars = new InputStream(InputFromFile, true);
            var lexer = new HelloLexer(chars);
            var tokens = new CommonTokenStream(lexer);
            var parser = new HelloParser(tokens);
            parser.buildParseTrees = true;
            var tree = parser.startRule();
            var htmlChat = new CustomCobolListener(res,Filename.substr(Filename.lastIndexOf('/') + 1,Filename.length - Filename.lastIndexOf('/') - 5));
            antlr4.tree.ParseTreeWalker.DEFAULT.walk(htmlChat, tree);
        } catch (e) {
            console.log(e)
        }
        // res.write("</body></html>");
        res.end();
    }).listen(1337);
    

    Please click this Link for COBOL grammar

    The COBOL code I am trying to parse is as follows

    IDENTIFICATION DIVISION.
    PROGRAM-ID.  Aroma96exam.
    
    ENVIRONMENT DIVISION.
    INPUT-OUTPUT SECTION.
    FILE-CONTROL.
       SELECT Oil-Details-File ASSIGN TO "ODF.DAT"
              ORGANIZATION IS INDEXED   
              ACCESS MODE IS DYNAMIC
              RECORD KEY IS Oil-Num-ODF
              ALTERNATE RECORD KEY IS Oil-Name-ODF
                          WITH DUPLICATES
              FILE STATUS IS ODF-Status.
    
       SELECT Oil-Stock-File ASSIGN TO "OSF.DAT"
              ORGANIZATION IS RELATIVE   
              ACCESS MODE IS DYNAMIC
              RELATIVE KEY IS Rel-Rec-Num 
              FILE STATUS IS OSF-Status.
    
       SELECT Trans-File ASSIGN TO "TRANS.DAT"
            ORGANIZATION IS LINE SEQUENTIAL.   
    
       SELECT Report-File ASSIGN TO "OILSTOCK.RPT".
    
       SELECT Error-File ASSIGN TO "ERROR.DAT"
            ORGANIZATION IS LINE SEQUENTIAL. 
    
      
    
    DATA DIVISION.
    FILE SECTION.
    FD Oil-Details-File.
    01 ODF-Rec.
       88 End-Of-ODF		VALUE HIGH-VALUES.
       02 Oil-Num-ODF               PIC 9(4).
       02 Oil-Name-ODF              PIC X(20).
       02 Unit-Size-ODF		PIC 9(2).
       02 Unit-Cost-ODF		PIC 99V99.
    
    FD Oil-Stock-File.
    01 OSF-Rec.
       02 Oil-Num-OSF		PIC 9(4).
       02 Qty-In-Stock-OSF		PIC 9(5).
    
    FD Trans-File.
    01 Trans-Rec.
       88 End-Of-Trans		VALUE HIGH-VALUES.
       02 Type-Code			PIC 9.
          88 Add-To-Stock		VALUE 1.
          88 Remove-From-Stock      VALUE 2.
       02 Oil-Num.
          03  Rel-Rec-Num		PIC 9(3).
          03  FILLER		PIC 9.
       02 Qty			PIC 9(5).
    
    FD Error-File.
    01 Error-Rec			PIC X(10).
    
    FD Report-File REPORT IS Oil-Stock-Report.
    
    
    WORKING-STORAGE SECTION.
    01 Status-Codes.
       02 ODF-Status                PIC X(2).
       02 OSF-Status                PIC X(2).
          88 No-Error-Found		VALUE "00".
          88 Rec-Not-Found		VALUE "23".
    
    
    01 Stock-Value			PIC 9(5)V99.
    
    REPORT SECTION.
    RD Oil-Stock-Report
       CONTROLS ARE FINAL
                    Oil-Name-ODF
       PAGE LIMIT IS 66
       HEADING 2
       FIRST DETAIL 8
       LAST DETAIL 50
       FOOTING 55.
    
    01 TYPE IS REPORT HEADING.
       02 LINE 2.
          03 COLUMN 15		PIC X(18) VALUE "OIL  STOCK  REPORT".
       02 LINE 3.
          03 COLUMN 13		PIC X(22) VALUE ALL "-".
       
    01 TYPE IS PAGE HEADING.
       02 LINE 6.
          03 COLUMN 03		PIC X(9)  VALUE "OIL  NAME".
          03 COLUMN 23		PIC X(4)  VALUE "OIL#".
          03 COLUMN 29		PIC X(4)  VALUE "SIZE".
          03 COLUMN 36		PIC X(3)  VALUE "QTY".
          03 COLUMN 44		PIC X(11) VALUE "STOCK VALUE".
    
    01 Stock-Detail-Line TYPE IS DETAIL.
       02 LINE IS PLUS 2.
          03 COLUMN 01		PIC X(20) SOURCE Oil-Name-ODF GROUP INDICATE.
          03 COLUMN 23		PIC 9(4)  SOURCE Oil-Num-ODF.
          03 COLUMN 30		PIC 99    SOURCE Unit-Size-ODF.
          03 COLUMN 35              PIC ZZ,ZZ9 SOURCE Qty-In-Stock-OSF.
          03 COLUMN 44              PIC $$$,$$9.99 SOURCE Stock-Value.
    
    01 TYPE IS CONTROL FOOTING Oil-Name-ODF NEXT GROUP PLUS 1.
       02 LINE IS PLUS 2.
          03 COLUMN 27		PIC X(15) VALUE "TOTAL OIL VALUE".
          03 Oil-Val COLUMN 44      PIC $$$$,$$9.99 SUM Stock-Value.
    
    01 TYPE IS CONTROL FOOTING FINAL.
       02 LINE IS PLUS 3.
          03 COLUMN 27              PIC X(17) VALUE "TOTAL STOCK VALUE".
          03 COLUMN 46              PIC $$,$$$,$$9.99 SUM Oil-Val.
     
    
    PROCEDURE DIVISION.
    Begin.
       OPEN I-O Oil-Details-File.
       OPEN I-O Oil-Stock-File.
       OPEN OUTPUT Error-File.
       OPEN INPUT Trans-File.
       READ Trans-File 
          AT END SET End-Of-Trans TO TRUE
          END-READ.
       PERFORM Process-Transactions UNTIL End-Of-Trans.
    
       CLOSE Error-File.
       CLOSE Trans-File.  
       OPEN OUTPUT Report-File.
       INITIATE Oil-Stock-Report.
    
       MOVE SPACES TO Oil-Name-ODF.
       START Oil-Details-File 
          KEY IS GREATER THAN Oil-Name-ODF
          INVALID KEY DISPLAY "Start Error FS = " ODF-Status
       END-START.
       READ Oil-Details-File NEXT RECORD
          AT END SET End-Of-ODF TO TRUE
       END-READ.
       PERFORM Print-Stock-Report UNTIL End-Of-ODF.
       TERMINATE Oil-Stock-Report.
       CLOSE Oil-Details-File.
       CLOSE Oil-Stock-File.
       STOP RUN.
    
    Process-Transactions.
       READ Oil-Stock-File
           INVALID KEY DISPLAY "OSF rec not found FS = " OSF-Status
       END-READ.
       IF No-Error-Found 
          EVALUATE TRUE
            WHEN Add-To-Stock ADD Qty TO Qty-In-Stock-OSF
            WHEN Remove-From-Stock SUBTRACT Qty FROM Qty-In-Stock-OSF
            WHEN OTHER DISPLAY "Type code not 1 or 2 Rec = " Trans-Rec
          END-EVALUATE
          REWRITE OSF-Rec
             INVALID KEY DISPLAY "Problem on REWRITE FS= " OSF-Status
          END-REWRITE
        ELSE IF Rec-Not-Found 
                    WRITE Error-Rec FROM Trans-Rec
             END-IF
       END-IF.  
       READ Trans-File 
          AT END SET End-Of-Trans TO TRUE
       END-READ. 
    
    Print-Stock-Report.
       MOVE Oil-Num-ODF TO Oil-Num
       READ Oil-Stock-File
          INVALID KEY DISPLAY "Error on reading OSF SF= " OSF-Status
       END-READ.
       COMPUTE Stock-Value = Unit-Cost-ODF * Qty-In-Stock-OSF.
       GENERATE Stock-Detail-Line. 
       READ Oil-Details-File NEXT RECORD
          AT END SET End-Of-ODF TO TRUE
       END-READ.
    

    Here is an image of the parse tree produced from the above COBOL code with the given grammar. It clearly shows that TRUE is identified as a booleanliteral, as expected.

    However, when I run the code on my local Node.js installation, it gives the following runtime error:

    /bin/node /mnt/sdb3/antlr/antlr.js
    server running
    line 125:33 missing {ABORT, AS, ASCII, ASSOCIATED_DATA, ASSOCIATED_DATA_LENGTH, ATTRIBUTE, AUTO, AUTO_SKIP, BACKGROUND_COLOR, BACKGROUND_COLOUR, BEEP, BELL, BINARY, BIT, BLINK, BOUNDS, CAPABLE, CCSVERSION, CHANGED, CHANNEL, CLOSE_DISPOSITION, COBOL, COMMITMENT, CONTROL_POINT, CONVENTION, CRUNCH, CURSOR, DEFAULT, DEFAULT_DISPLAY, DEFINITION, DFHRESP, DFHVALUE, DISK, DONTCARE, DOUBLE, EBCDIC, EMPTY_CHECK, ENTER, ENTRY_PROCEDURE, ERASE, EOL, EOS, ESCAPE, EVENT, EXCLUSIVE, EXPORT, EXTENDED, FOREGROUND_COLOR, FOREGROUND_COLOUR, FULL, FUNCTIONNAME, FUNCTION_POINTER, GRID, HIGHLIGHT, IMPLICIT, IMPORT, INTEGER, IS, JUST, JUSTIFIED, KANJI, KEPT, KEYBOARD, LANGUAGE, LB, LD, LEFTLINE, LENGTH_CHECK, LIBACCESS, LIBPARAMETER, LIBRARY, LIST, LOCAL, LOCK, LONG_DATE, LONG_TIME, LOWER, LOWLIGHT, MMDDYYYY, NAMED, NATIONAL, NATIONAL_EDITED, NETWORK, NO_ECHO, NUMERIC_DATE, NUMERIC_TIME, OCCURS, ODT, ORDERLY, OVERLINE, OWN, PASSWORD, PORT, PRINTER, PRIVATE, PROCESS, PROGRAM, PROMPT, READER, REMOTE, REAL, RECEIVED, RECURSIVE, REF, REMOVE, REQUIRED, REVERSE_VIDEO, SAVE, SECURE, SHARED, SHAREDBYALL, SHAREDBYRUNUNIT, SHARING, SHORT_DATE, SYMBOL, TASK, THREAD, THREAD_LOCAL, TIMER, TODAYS_DATE, TODAYS_NAME, TRUE, TRUNCATED, TYPEDEF, UNDERLINE, VIRTUAL, WAIT, YEAR, YYYYMMDD, YYYYDDD, ZERO_FILL, '66', '77', '88', INTEGERLITERAL, NUMERICLITERAL, IDENTIFIER} at 'TRUE'
    

    A StackOverflow user also pointed out that the code runs fine in Java.

    File Attachment to reproduce the code. Git.zip

    opened by Gravity-I-Pull-You-Down 3
  • Denial-of-Service against antlr4-maven-plugin with malformed grammar options

    Version: 4.9.2

    Smallest Grammar

    parser grammar doom;
    
    options {
        tokenVocab=doom;
    }
    
    

    Expected

    Something like the command line tool error:

    $ java -jar /tmp/antlr-4.9.2-complete.jar doom.g4
    error(99): doom.g4::: grammar doom has no non-fragment rules
    

    Alternatively, some other recoverable, non-StackOverflowError exception.

    Actual

    Putting this file in an antlr4-maven-plugin project at src/main/antlr4/doom.g4 causes:

    Exception in thread "main" java.lang.StackOverflowError
            at java.base/java.util.HashMap.hash(HashMap.java:339)
            at java.base/java.util.LinkedHashMap.get(LinkedHashMap.java:440)
            at org.antlr.v4.misc.Graph.getNode(Graph.java:51)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:206)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
            at org.antlr.mojo.antlr4.GrammarDependencies.explore(GrammarDependencies.java:208)
    

    Interestingly, when editing antlr4-maven projects via m2e in Eclipse, this makes Eclipse unusable as it loads the grammar, calls m2e(?), gets the SOE, and wants to restart, preventing access to editing the offending grammar in Eclipse, hence why I classify this as a DoS, and not just a wrong error type.
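    The stack trace suggests that the dependency exploration recurses without end when a grammar names itself as its tokenVocab. The plugin's internals are not shown here, so the following Python sketch only illustrates one common way to make such a traversal terminate; `explore`, `find_self_dependencies`, and the dictionary-based graph are hypothetical.

```python
def explore(graph, start):
    """Return every grammar reachable from `start`, visiting each node once."""
    visited = set()
    order = []
    stack = [start]  # explicit stack, so depth is not limited by recursion
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # already handled; this is what breaks cycles
        visited.add(node)
        order.append(node)
        stack.extend(graph.get(node, []))
    return order


def find_self_dependencies(graph):
    """List grammars that depend directly on themselves."""
    return sorted(name for name, deps in graph.items() if name in deps)


# The grammar from this report: doom uses doom as its token vocabulary.
doom_graph = {"doom": ["doom"]}
reachable = explore(doom_graph, "doom")
offenders = find_self_dependencies(doom_graph)
```

    With a visited set the traversal finishes, and the self-reference can then be reported as an ordinary, recoverable error.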

    opened by byteit101 0
  • about Cpp runtime token factory

    I'm not sure if this is an issue: in Lexer, the _factory member has the following type

    TokenFactory<CommonToken> *_factory;
    

    and the setTokenFactory has the following definition

    template<typename T1>
    void setTokenFactory(TokenFactory<T1> *factory)  {
      this->_factory = factory;
    }
    

    For any T1 that is not CommonToken, this function will fail. In my understanding, TokenFactory should not be a template. The proposed modification is as follows

    // TokenFactory.h
    class ANTLR4CPP_PUBLIC TokenFactory {
    ...
    virtual std::unique_ptr<Token> create ...;
    virtual std::unique_ptr<Token> create ...;
    };
    // Lexer.h
    ...
    TokenFactory *_factory;
    ...
    void setTokenFactory(TokenFactory *factory)  {
      this->_factory = factory;
    }
    ...
    
    opened by oldoldman 1
  • Drop Python2 runtime support

    Python 2 has not been officially supported since January 1st, 2020: https://www.python.org/doc/sunset-python-2/

    I think now it's time to stop supporting it in ANTLR because it consumes additional resources.

    Also, there is a lot of code in common between the Python2 and Python3 runtimes.

    opened by KvanTTT 2
Releases(4.9.2)
Owner
Antlr Project
The Project organization for the ANTLR parser generator.