Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refresh of asynchronous materialized view fails if base table is partitioned using expression partitioning. #54202

Open
njupudi-eightfold opened this issue Dec 20, 2024 · 1 comment · May be fixed by #54278
Assignees
Labels
type/bug Something isn't working

Comments

@njupudi-eightfold
Copy link

Refresh of asynchronous materialized view fails if base table is partitioned using expression partitioning.

Facts:

  1. Cluster is in shared-data mode.
  2. Base table is partitioned on string data type column using expression partitioning
  3. Asynchronous materialized view is also partitioned on same column as base table using expression partitioning.

Steps to reproduce the behavior (Required)

CREATE TABLE `test_base_tbl` (
  col1 varchar(100),
  col2 varchar(100),
  col3 bigint
) ENGINE=OLAP
PRIMARY KEY (col1)
PARTITION BY (col1)
DISTRIBUTED BY HASH(col1) BUCKETS 5
ORDER BY (col2)
PROPERTIES (
  "persistent_index_type" = "CLOUD_NATIVE"
);

insert into test_base_tbl values
('a.com', 'val1', 100),
('A.com', 'val1', 200),
('A.Com', 'val1', 300)

CREATE MATERIALIZED VIEW `test_async_mv`
PARTITION BY (col1)
DISTRIBUTED BY HASH(col1)
REFRESH MANUAL
PROPERTIES (
"query_rewrite_consistency" = "LOOSE",
"session.enable_spill" = "true",
"datacache.enable" = "true",
"enable_async_write_back" = "false",
"storage_volume" = "builtin_storage_volume"
)
AS SELECT col1, sum(col3) from test_base_tbl group by col1;

Expected behavior (Required)

Refresh of this asynchronous materialized view should be successful.

Real behavior (Required)

Refresh of this asynchronous materialized view fails with the following error:

Refresh materialized view test_async_mv failed after retrying 1 times(try-lock 0 times), error-msg : com.starrocks.sql.analyzer.SemanticException: Getting analyzing error. Detail message: Duplicate partition name pA2ecom.
	at com.starrocks.sql.analyzer.AlterTableClauseAnalyzer.visitAddPartitionClause(AlterTableClauseAnalyzer.java:1079)
	at com.starrocks.sql.analyzer.AlterTableClauseAnalyzer.visitAddPartitionClause(AlterTableClauseAnalyzer.java:111)
	at com.starrocks.sql.ast.AddPartitionClause.accept(AddPartitionClause.java:80)
	at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:71)
	at com.starrocks.sql.analyzer.AlterTableClauseAnalyzer.analyze(AlterTableClauseAnalyzer.java:119)
	at com.starrocks.scheduler.mv.MVPCTRefreshListPartitioner.addListPartitions(MVPCTRefreshListPartitioner.java:378)
	at com.starrocks.scheduler.mv.MVPCTRefreshListPartitioner.syncAddOrDropPartitions(MVPCTRefreshListPartitioner.java:113)
	at com.starrocks.scheduler.PartitionBasedMvRefreshProcessor.syncPartitions(PartitionBasedMvRefreshProcessor.java:914)
	at com.starrocks.scheduler.PartitionBasedMvRefreshProcessor.doRefreshMaterializedView(PartitionBasedMvRefreshProcessor.java:413)
	at com.starrocks.scheduler.PartitionBasedMvRefreshProcessor.doRefreshMaterializedViewWithRetry(PartitionBasedMvRefreshProcessor.java:368)
	at com.starrocks.scheduler.PartitionBasedMvRefreshProcessor.doMvRefresh(PartitionBasedMvRefreshProcessor.java:327)
	at com.starrocks.scheduler.PartitionBasedMvRefreshProcessor.processTaskRun(PartitionBasedMvRefreshProcessor.java:199)
	at com.starrocks.scheduler.TaskRun.executeTaskRun(TaskRun.java:272)
	at com.starrocks.scheduler.TaskRunExecutor.lambda$executeTaskRun$0(TaskRunExecutor.java:58)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

StarRocks version (Required)

3.3.5-8fold-1210-aa794ea

@njupudi-eightfold njupudi-eightfold added the type/bug Something isn't working label Dec 20, 2024
@LiShuMing LiShuMing linked a pull request Dec 24, 2024 that will close this issue
24 tasks
@LiShuMing
Copy link
Contributor

This is because our partition name are always case-insensitive, but automatic partitions in be will create case-sensitive partitions.

mysql> insert into test_base_tbl values
    -> ('a.com', 'val1', 100),
    -> ('A.com', 'val1', 200),
    -> ('A.Com', 'val1', 300);
ERROR 1064 (HY000): Insert has filtered data, txn_id = 19015, tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=30034
mysql> select tracking_log from information_schema.load_tracking_logs where job_id=30034
    -> ;
+------------------------------------------------------------------------------------------------------+
| tracking_log                                                                                         |
+------------------------------------------------------------------------------------------------------+
| Error: The row is out of partition ranges. Please add a new partition.. Row: ['A.com', 'val1', 200]
 |
+------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> show partitions from test_base_tbl;
+-------------+---------------+----------------+---------------------+--------------------+--------+--------------+-------------+-----------------+---------+----------------+---------------+---------------------+--------------------------+----------+------------+----------+-------------+--------------------+----------------+
| PartitionId | PartitionName | VisibleVersion | VisibleVersionTime  | VisibleVersionHash | State  | PartitionKey | List        | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime        | LastConsistencyCheckTime | DataSize | IsInMemory | RowCount | DataVersion | VersionEpoch       | VersionTxnType |
+-------------+---------------+----------------+---------------------+--------------------+--------+--------------+-------------+-----------------+---------+----------------+---------------+---------------------+--------------------------+----------+------------+----------+-------------+--------------------+----------------+
| 30036       | pa2ecom       | 1              | 2024-12-24 13:38:57 | 0                  | NORMAL | col1         | [["a.com"]] | col1            | 5       | 1              | HDD           | 9999-12-31 23:59:59 | NULL                     | 0B       | false      | 0        | 1           | 329634415550398464 | TXN_NORMAL     |
| 30060       | pA2eCom       | 1              | 2024-12-24 13:38:57 | 0                  | NORMAL | col1         | [["A.Com"]] | col1            | 5       | 1              | HDD           | 9999-12-31 23:59:59 | NULL                     | 0B       | false      | 0        | 1           | 329634415550398466 | TXN_NORMAL     |
| 30048       | pA2ecom       | 1              | 2024-12-24 13:38:57 | 0                  | NORMAL | col1         | [["A.com"]] | col1            | 5       | 1              | HDD           | 9999-12-31 23:59:59 | NULL                     | 0B       | false      | 0        | 1           | 329634415550398465 | TXN_NORMAL     |
+-------------+---------------+----------------+---------------------+--------------------+--------+--------------+-------------+-----------------+---------+----------------+---------------+---------------------+--------------------------+----------+------------+----------+-------------+--------------------+----------------+
3 rows in set (0.01 sec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants