自己动手写分布式搜索引擎
上QQ阅读APP看书,第一时间看更新

2.5.1 运算

BasicOperations中包含一些Automaton运算。Intersection()方法用于求交集。例如:

        //创建一个可以接收A~W字符的自动机
        Automaton a = BasicAutomata.makeCharRange('A', 'W');
        Automaton b = BasicAutomata.makeCharRange('D', 'W');
        Automaton c = BasicOperations.intersection(a, b);
        System.out.println(BasicOperations.run(c, "A")); //输出false
        System.out.println(BasicOperations.run(c, "W")); //输出true

union()方法用于求并集。例如:

        Automaton a = BasicAutomata.makeCharRange('D', 'W');
        Automaton b = BasicAutomata.makeCharRange('A', 'C');
        Automaton c = BasicOperations.union(a, b);
        System.out.println(BasicOperations.run(c, "A")); // 输出true
        System.out.println(BasicOperations.run(c, "W")); // 输出true

concatenate()方法用于连接两个自动机。例如:

        Automaton a = BasicAutomata.makeChar('W');
        Automaton b = BasicAutomata.makeChar('W');
        Automaton c = BasicOperations.concatenate(a, b);
        System.out.println(BasicOperations.run(c, "WW")); // 输出true
        System.out.println(BasicOperations.run(c, "WWW")); // 输出false

repeat()方法表示重复0次或者多次。例如重复字符“W”零次或者多次:

        Automaton a = BasicAutomata.makeChar('W');
        Automaton c = BasicOperations.repeat(a);
        System.out.println(BasicOperations.run(c, "WW")); // 输出true
        System.out.println(BasicOperations.run(c, "WWW")); // 输出true

determinize()方法表示把NFA转换成DFA。

        a.deberminize();

复杂的正则表达式难以维护。“[yY]es”这样的正则表达式可以写成如下等价的Automaton:

        Automaton a = BasicAutomata.makeChar('Y'); //创建一个字符W组成的自动机
        Automaton b = BasicAutomata.makeChar('y');
        Automaton c = a.union(b);
        Automaton d = c.concatenate(BasicAutomata.makeString("es"));
        //使用它
        System.out.println(BasicOperations.run(d, "y yes")); //输出true