Calcite 是 Apache 的开源框架,单词直译为“方解石”,是一个动态数据管理框架,官方给出的说明是可以管理任何数据,比如一个满是 csv 数据的文件夹,使用 Calcite 框架,就可以把这个文件夹当作是一个数据库来操作,官方使用说明在这个地址:https://calcite.apache.org/docs/tutorial.html

在其框架首页,简要介绍了框架提供的三大功能:

  • Standard SQL:标准SQL解析、验证、JDBC驱动
  • Query optimization:查询优化
  • Any data, anywhere:连接到任何第三方数据源

使用 SQL 解析器从 SQL 语句中解析表名

maven 依赖

<dependency>
   <groupId>junit</groupId>
   <artifactId>junit</artifactId>
   <version>4.13.2</version>
   <scope>test</scope>
</dependency>
<dependency>
     <groupId>org.apache.calcite</groupId>
     <artifactId>calcite-core</artifactId>
     <version>1.31.0</version>
</dependency>

实现代码

public static List<String> extractTableNameList(String sql) throws SqlParseException {
        SqlParser parser = SqlParser.create(sql);
        SqlNode parsed = parser.parseQuery();
        List<String> tableNameList = new ArrayList<>();

        parseSqlNode(parsed, tableNameList);

        return tableNameList;
    }

    private static void parseFromNode(SqlNode from, List<String> tableNameList){
        SqlKind kind = from.getKind();
        switch (kind) {
            case IDENTIFIER:
                //最终的表名
                SqlIdentifier sqlIdentifier = (SqlIdentifier) from;
                tableNameList.add(sqlIdentifier.toString());
                break;
            case AS:
                SqlBasicCall sqlBasicCall = (SqlBasicCall) from;
                SqlNode selectNode = sqlBasicCall.getOperandList().get(0);
                parseSqlNode(selectNode, tableNameList);
                break;
            case JOIN:
                SqlJoin sqlJoin = (SqlJoin) from;
                SqlNode left = sqlJoin.getLeft();
                parseFromNode(left, tableNameList);
                SqlNode right = sqlJoin.getRight();
                parseFromNode(right, tableNameList);
                break;
            case SELECT:
                parseSqlNode(from, tableNameList);
                break;
        }
    }

    private static void parseSqlNode(SqlNode sqlNode, List<String> tableNameList) {
        SqlKind kind = sqlNode.getKind();
        switch (kind) {
            case IDENTIFIER:
                parseFromNode(sqlNode, tableNameList);
                break;
            case SELECT:
                SqlSelect select = (SqlSelect) sqlNode;
                parseFromNode(select.getFrom(), tableNameList);
                break;
            case UNION:
                ((SqlBasicCall) sqlNode).getOperandList().forEach(node -> {
                    parseSqlNode(node, tableNameList);
                });

                break;
            case ORDER_BY:
                handlerOrderBy(sqlNode, tableNameList);
                break;
        }
    }

    private static void handlerOrderBy(SqlNode node, List<String> tableNameList) {
        SqlOrderBy sqlOrderBy = (SqlOrderBy) node;
        SqlNode query = sqlOrderBy.query;
        parseSqlNode(query, tableNameList);
    }

对于单表的查询,解析出的 SqlKind 类型为 SELECT,这种情况就可以直接处理 From 就可以,对于多表查询,解析出的 SqlKind 类型为 JOIN,这种情况要进一步处理。

单元测试

用以下两个 SQL 语句进行测试:

select id,name from person
select a.id,b.name,b.label from person a, person b where a.id = b.id;

输出分别为:[PERSON]、[PERSON, PERSON]

处理的核心为 SqlNode,这是个抽象类,不同类型的查询有不同的实现,本例中主要涉及到的类如下图:



参考:

https://calcite.apache.org/

https://calcite.apache.org/docs/tutorial.html

https://www.cxyzjd.com/article/junjun56/115870189