Page updated: 2004-07-11 (Sun)

This is the manual bundled with JFlex 1.4. I may translate it when I find the time, so I am posting it here.

1 Introduction

JFlex is a lexical analyzer generator for Java, written in Java.

It is also a rewrite of the very useful tool JLex [3], which was developed by Elliot Berk at Princeton University.

As Vern Paxson states for his C/C++ tool flex [11]: they do not share any code though.

1.1 Design goals

The main design goals of JFlex are:

  • Full Unicode support
  • Fast generated scanners
  • Fast scanner generation
  • Convenient specification syntax
  • Platform independence
  • JLex compatibility

1.2 About this manual

2 Installing and Running JFlex

2.1 Installing JFlex

2.1.1 Windows

2.1.2 Unix with tar archive

2.1.3 Linux with RPM

2.2 Running JFlex

3 A simple Example: How to work with JFlex

3.1 Code to include

3.2 Options and Macros

3.3 Rules and Actions

3.4 How to get it going

4 Lexical Specifications

4.1 User code

4.2 Options and declarations

4.2.1 Class options and user class code

4.2.2 Scanning method

4.2.3 The end of file

4.2.4 Standalone scanners

4.2.5 CUP compatibility

4.2.6 BYacc/J compatibility

4.2.7 Code generation

4.2.8 Character sets

%7bit

Causes the generated scanner to use a 7 bit input character set (character codes 0-127).

Because this is the default value in JLex, JFlex also defaults to 7 bit scanners.

If an input character with a code greater than 127 is encountered in an input at runtime, the scanner will throw an ArrayIndexOutOfBoundsException.

For this reason, among others, you should consider using the %unicode directive.

See also section 5 for information about character encodings.
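
As a rough illustration of the overflow described above, the sketch below assumes a table with one entry per character code, indexed directly by the input character. The class SevenBitTableDemo and its table are hypothetical. This is not JFlex's generated code, only a minimal model of why a code above 127 can end in an ArrayIndexOutOfBoundsException.

```java
public class SevenBitTableDemo {
    // Hypothetical stand-in for a scanner's per-character lookup table:
    // one entry for each code in the 7 bit range 0-127.
    static final int[] CHAR_CLASS = new int[128];

    // Looks up the class of a character by using its code as the index.
    static int classify(char c) {
        return CHAR_CLASS[c];
    }

    // Returns true if looking up c runs past the end of the 128-entry table.
    static boolean overflows(char c) {
        try {
            classify(c);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows('a'));  // code 97, inside the table
        System.out.println(overflows('é'));  // code 233, outside the table
    }
}
```

A table sized for the full input range, as with %unicode, has no index that a Java char can exceed.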

%full
%8bit

Both options cause the generated scanner to use an 8 bit input character set (character codes 0-255).

If an input character with a code greater than 255 is encountered in an input at runtime, the scanner will throw an ArrayIndexOutOfBoundsException.

Note that even if your platform uses only one byte per character, the Unicode value of a character may still be greater than 255.

If you are scanning text files, you should consider using the %unicode directive.

See also section 5 for more information about character encodings.
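
The note about one-byte platforms can be checked with a small sketch. It assumes the JDK in use ships the windows-1252 charset, which is common but not guaranteed by the Java specification. The class and method names are made up for this example.

```java
import java.nio.charset.Charset;

public class OneByteBeyond255 {
    // Decodes a single byte with the named charset and returns the
    // Unicode value of the resulting character.
    static int decodeSingleByte(int b, String charsetName) {
        String s = new String(new byte[] { (byte) b }, Charset.forName(charsetName));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is available in this JDK.
        // In that encoding the single byte 0x80 is the euro sign, U+20AC.
        int code = decodeSingleByte(0x80, "windows-1252");
        System.out.println(Integer.toHexString(code));
        System.out.println(code > 255);
    }
}
```

The input here is one byte per character, yet the decoded character has a value far above 255, so a scanner limited to codes 0-255 could not handle it.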

%unicode
%16bit

Both options cause the generated scanner to use the full 16 bit Unicode input character set (character codes 0-65535).

There will be no runtime overflow when using this set of input characters.

%unicode does not mean that the scanner will read two bytes at a time.

What is read and what constitutes a character depends on the runtime platform.

See also section 5 for more information about character encodings.
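
To make the point about bytes versus characters concrete, here is a minimal sketch using only the standard java.io and java.nio.charset classes. It counts the characters a Reader delivers for the same two bytes under two different encodings. The class name and helper are hypothetical, and the explicit charsets stand in for whatever the runtime platform would choose by default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharactersDependOnDecoding {
    // Counts how many characters a Reader delivers for the given bytes
    // when they are decoded with the given charset.
    static int countChars(byte[] bytes, Charset charset) {
        int count = 0;
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 }; // UTF-8 encoding of U+00E9
        System.out.println(countChars(bytes, StandardCharsets.UTF_8));      // 1
        System.out.println(countChars(bytes, StandardCharsets.ISO_8859_1)); // 2
    }
}
```

The scanner only ever sees the characters produced by the Reader, so the number of bytes consumed per character is decided by the decoding step, not by %unicode.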

%caseless
%ignorecase

This option causes JFlex to handle all characters and strings in the specification as if they were specified in both uppercase and lowercase form.

This enables an easy way to specify a scanner for a language with case insensitive keywords.

For instance, the string "break" in a specification is handled like the expression ([bB][rR][eE][aA][kK]).

The %caseless option does not change the matched text and does not affect character classes.

So [a] still matches only the character a, and not A.

Which letters are uppercase and which are lowercase is defined by the Unicode standard and determined by JFlex with the Java methods Character.toUpperCase and Character.toLowerCase.
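
The expansion described for "break" can be sketched in plain Java. The helper below is hypothetical and uses java.util.regex as a stand-in for JFlex's own matcher, so it only models the behaviour. It also assumes the literal contains only letters and digits, since it does no regex escaping.

```java
import java.util.regex.Pattern;

public class CaselessExpansionDemo {
    // Expands a literal into a character-class expression,
    // e.g. "break" -> "([bB][rR][eE][aA][kK])".
    // Assumption: the literal contains only letters and digits.
    static String expand(String literal) {
        StringBuilder sb = new StringBuilder("(");
        for (char c : literal.toCharArray()) {
            char lower = Character.toLowerCase(c);
            char upper = Character.toUpperCase(c);
            sb.append('[').append(lower);
            if (upper != lower) {
                sb.append(upper); // characters without case get a single entry
            }
            sb.append(']');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        String expr = expand("break");
        System.out.println(expr);
        System.out.println(Pattern.matches(expr, "BrEaK")); // keyword in mixed case
        System.out.println(Pattern.matches("[a]", "A"));    // character class left as written
    }
}
```

The last line mirrors the rule that character classes are not affected: a class written as [a] is left alone, so it does not match A.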

In JLex compatibility mode (--jlex switch on the command line), %caseless and %ignorecase also affect character classes.

4.2.9 Line, character and column counting

4.2.10 Obsolete JLex options

4.2.11 State declarations

4.2.12 Macro definitions

4.3 Lexical rules

4.3.1 Syntax

4.3.2 Semantics

4.3.3 How the input is matched

4.3.4 The generated class

4.3.5 Scanner methods and fields accessible in actions (API)

5 Encodings, Platforms, and Unicode

5.1 The Problem

5.2 Scanning text files

5.3 Scanning binaries

6 A few words on performance

6.1 Comparison of JLex and JFlex

6.2 How to write a faster specification

7 Porting Issues

7.1 Porting from JLex

7.2 Porting from lex/flex

7.2.1 Basic structure

7.2.2 Macros and Regular Expression Syntax

7.2.3 Lexical Rules

8 Working together

8.1 JFlex and CUP

8.1.1 CUP version 0.10j

8.1.2 Using existing JFlex/CUP specifications with CUP 0.10j

8.1.3 Using older versions of CUP

8.2 JFlex and BYacc/J

9 Bugs and Deficiencies

9.1 Deficiencies

9.2 Bugs

10 Copying and License

References