Fill the table with few threads #325

Open
apb12 opened this issue Sep 11, 2024 · 8 comments

@apb12

apb12 commented Sep 11, 2024

This works:

    public void writeTable(Table table, List<String> requiredColumns, List<DTO> indexList) {
        for (int i = 0; i < requiredColumns.size(); i++) {
            writeValue(table, indexList, requiredColumns.get(i), i);
        }
    }

This does not work:

public void writeTable(Table table, List<String> requiredColumns, List<DTO> indexList) {
        ExecutorService executorService = Executors.newFixedThreadPool(requiredColumns.size());
        for (int i = 0; i < requiredColumns.size(); i++) {
            int finalI = i;
            executorService.submit(() -> writeValue(table, indexList, requiredColumns.get(finalI), finalI));
        }
        executorService.shutdown();
        try {
            executorService.awaitTermination(20, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

Common method:

    private void writeValue(Table table, List<DTO> indexList, String columnName, int columnNumber) {
        table.getCellByPosition(0, columnNumber).setStringValue(columnName);
        int rowNumber = 1;
        for (DTO index : indexList) {
            if (columnName.equals("id")) {
                table.getCellByPosition(rowNumber++, columnNumber).setStringValue(index.getId());
            }
            if (columnName.equals("name")) {
                table.getCellByPosition(rowNumber++, columnNumber).setStringValue(index.getName());
            }
            // etc.
        }
    }
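The common method compares the column name against every known name once per row. A minimal sketch of one alternative is to look up a getter once per column and then apply it to each row. `RowDto` and `ColumnExtractors` are hypothetical stand-ins for the DTO and helper in the question, not part of ODFDOM, and whether this changes the run time noticeably is something only profiling can tell.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the DTO used in the question.
final class RowDto {
    private final String id;
    private final String name;

    RowDto(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
}

// Hypothetical helper: maps a column name to the getter that supplies its values.
final class ColumnExtractors {
    static final Map<String, Function<RowDto, String>> EXTRACTORS = new LinkedHashMap<>();

    static {
        EXTRACTORS.put("id", RowDto::getId);
        EXTRACTORS.put("name", RowDto::getName);
    }

    // Returns the header followed by one value per row, top to bottom.
    static List<String> columnValues(String columnName, List<RowDto> rows) {
        // The lookup happens once per column instead of once per row.
        Function<RowDto, String> extractor = EXTRACTORS.get(columnName);
        if (extractor == null) {
            throw new IllegalArgumentException("Unknown column: " + columnName);
        }
        List<String> values = new ArrayList<>(rows.size() + 1);
        values.add(columnName);
        for (RowDto row : rows) {
            values.add(extractor.apply(row));
        }
        return values;
    }
}
```

The returned list can then be written into the table by a single loop, so an unknown column name fails loudly instead of silently leaving the column empty.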

If I do it in a single thread it works, but it takes too long: with 14 columns and 590k rows it takes more than 30 minutes.

I tried the multithreaded code above, where each column is written in its own thread. It runs really fast, but the resulting file is not readable.

Is there any way to do this faster? My app is based on your lib. Any ideas would be appreciated.

@mistmist
Contributor

  1. ODFDOM isn't thread-safe, so the client needs to make sure it's only accessed by one thread at a time.

  2. The ODFDOM architecture isn't really a good fit for processing "huge" spreadsheets. It's much better to use LibreOffice Calc for big spreadsheets, with its highly optimized C++ core. You can use it from Java via the UNO API.

  3. If you want to speed up ODFDOM, start by profiling where it's spending all the time. I bet it can be made substantially faster than it is currently, but it will never be performance competitive with Calc.
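Point 1 can be sketched as a pattern where worker threads only prepare plain Java data and a single thread performs every write to the document. `PlainGrid` and `SingleWriterFill` below are hypothetical stand-ins, not ODFDOM classes; with ODFDOM, the `setCell` call would be replaced by the real table access. This is a sketch under the assumption that preparing values is worth parallelising at all. In the question the values come straight from in-memory DTOs, so the single-threaded write is likely to remain the dominant cost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a non-thread-safe document table.
final class PlainGrid {
    private final List<List<String>> columns = new ArrayList<>();

    void setCell(int col, int row, String value) {
        while (columns.size() <= col) {
            columns.add(new ArrayList<>());
        }
        List<String> column = columns.get(col);
        while (column.size() <= row) {
            column.add("");
        }
        column.set(row, value);
    }

    String getCell(int col, int row) {
        return columns.get(col).get(row);
    }
}

final class SingleWriterFill {
    // Each producer computes one column's values; only the calling thread touches the grid.
    static void fill(PlainGrid grid, List<Callable<List<String>>> columnProducers) {
        int threads = Math.max(1, Math.min(columnProducers.size(),
                Runtime.getRuntime().availableProcessors()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until all tasks finish and keeps the task order.
            List<Future<List<String>>> results = pool.invokeAll(columnProducers);
            for (int col = 0; col < results.size(); col++) {
                List<String> values = results.get(col).get();
                for (int row = 0; row < values.size(); row++) {
                    grid.setCell(col, row, values.get(row));
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while preparing columns", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A column producer failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the grid is only ever touched by the thread that calls `fill`, no locking is needed around it.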

@svanteschubert
Contributor

svanteschubert commented Sep 11, 2024 via email

@apb12
Author

apb12 commented Sep 11, 2024

svanteschubert thanks for this deep answer.
But I'm not sure about 1a. This code does not write data to the same place. All it does is fill each column completely from top to bottom, so if you have 14 columns, 14 threads will fill them and every thread writes only to its own column. That means the threads will never write to the same cell; it's impossible, because every thread gets its own column number and works only with it.
thread 1 - gets the cell at position (0, 0) and writes data to it, then goes down and gets the cell at position (1, 0), then (2, 0)
thread 2 - gets the cell at position (0, 1) and writes data to it, then goes down and gets the cell at position (1, 1), then (2, 1)

And it does not work: the file is not readable.

If I do it in a single thread there are no issues and the file is readable.

So I think ODF isn't thread-safe even when the threads write to different coordinates of the table.
Maybe it works for rows, meaning the first thread fills the first 100k rows, the next one fills the next 100k, and finally we get a table with 200k rows.

@svanteschubert
Contributor

svanteschubert commented Sep 11, 2024 via email

@apb12
Author

apb12 commented Sep 11, 2024

svanteschubert I agree with you, it seems we can only split rows across threads, not columns. I will try it out btw. Thanks for answering.

@apb12
Author

apb12 commented Sep 11, 2024

svanteschubert I also want to say that there is no error or exception, so debugging does not show anything helpful in this case. Everything goes well until you try to open the file xD
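One Java detail may be relevant to the absence of exceptions: a task started with `ExecutorService.submit` stores anything it throws in the returned `Future`, and that exception stays invisible unless `Future.get()` is called. The sketch below, with the hypothetical helper `SurfaceTaskErrors`, collects such failures. It does not show that an exception occurred in the code above; corruption caused by unsynchronised access can also happen without any exception at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs tasks on a pool and reports what each failed task threw.
final class SurfaceTaskErrors {
    static List<Throwable> runAndCollectFailures(List<Runnable> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        List<Future<?>> futures = new ArrayList<>();
        for (Runnable task : tasks) {
            futures.add(pool.submit(task));
        }
        pool.shutdown();

        List<Throwable> failures = new ArrayList<>();
        for (Future<?> future : futures) {
            try {
                // get() rethrows a task's exception wrapped in ExecutionException.
                future.get();
            } catch (ExecutionException e) {
                failures.add(e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return failures;
    }
}
```

An empty result only means that no task threw; it says nothing about whether the shared object was left in a consistent state.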

@svanteschubert
Contributor

svanteschubert commented Sep 11, 2024 via email

@svanteschubert
Contributor

svanteschubert commented Sep 12, 2024 via email
