Error when loading an Excel file

nuvitu9999
Level 3
I'm facing an issue while processing an Excel file, and the error log is very difficult to understand.

I hope someone has run into this issue before and can help me solve it.

 

Error.png

[15:20:26] [DEBUG] [com.monitorjbl.xlsx.impl.StreamingWorkbookReader] - Deleting tmp file [/home/dataiku/dataiku/tmp/tmp-15330357160647418901.xlsx]
[15:20:26] [ERROR] [dku.input.push] - Push failed, cleanup resources
java.lang.NumberFormatException: For input string: "1e6"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Integer.parseInt(Integer.java:652)
at java.base/java.lang.Integer.parseInt(Integer.java:770)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.handleEvent(StreamingSheetReader.java:118)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.getRow(StreamingSheetReader.java:71)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.access$200(StreamingSheetReader.java:32)
at com.monitorjbl.xlsx.impl.StreamingSheetReader$StreamingRowIterator.hasNext(StreamingSheetReader.java:402)
at com.dataiku.dip.formats.excel.ExcelFormatExtractor.doExtractStream(ExcelFormatExtractor.java:148)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.extractSimple(ArchiveCapableFormatExtractor.java:154)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:59)
at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:177)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:234)
at com.dataiku.dip.dataflow.exec.stream.SingleThreadFSLikeDatasetRunnable.run(SingleThreadFSLikeDatasetRunnable.java:71)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[15:20:26] [INFO] [dku.output.sql.pglike] - Aborting transaction
[15:20:26] [INFO] [dip.connection.share] - Give connection refCount=1
[15:20:26] [INFO] [dip.connection.share] - > closing connection with failure
[15:20:26] [DEBUG] [dku.connections.sql.provider] - Rollback conn=Dataiku_DB-bLHcHzb
[15:20:26] [DEBUG] [dku.connections.sql.provider] - Close conn=Dataiku_DB-bLHcHzb
[15:20:26] [DEBUG] [dku.resourceusage] - Reporting completion of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"POLICYDATA","jobId":"Build_OP01_New_prepared__NP__2021-12-10T08-17-38.256","activityId":"compute_OP01_New_prepared_NP","activityType":"recipe","recipeType":"shaker","recipeName":"compute_OP01_New_prepared"},"type":"SQL_CONNECTION","id":"sJbVBA5JxXDiClCt","startTime":1639124261558,"sqlConnection":{"connection":"Dataiku_DB"}}
[15:20:26] [INFO] [dku.flow.activity] - Run thread failed for activity compute_OP01_New_prepared_NP
java.lang.NumberFormatException: For input string: "1e6"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Integer.parseInt(Integer.java:652)
at java.base/java.lang.Integer.parseInt(Integer.java:770)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.handleEvent(StreamingSheetReader.java:118)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.getRow(StreamingSheetReader.java:71)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.access$200(StreamingSheetReader.java:32)
at com.monitorjbl.xlsx.impl.StreamingSheetReader$StreamingRowIterator.hasNext(StreamingSheetReader.java:402)
at com.dataiku.dip.formats.excel.ExcelFormatExtractor.doExtractStream(ExcelFormatExtractor.java:148)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.extractSimple(ArchiveCapableFormatExtractor.java:154)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:59)
at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:177)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:234)
at com.dataiku.dip.dataflow.exec.stream.SingleThreadFSLikeDatasetRunnable.run(SingleThreadFSLikeDatasetRunnable.java:71)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[15:20:26] [INFO] [dku.flow.activity] running compute_OP01_New_prepared_NP - activity is finished
[15:20:26] [ERROR] [dku.flow.activity] running compute_OP01_New_prepared_NP - Activity failed
java.lang.NumberFormatException: For input string: "1e6"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Integer.parseInt(Integer.java:652)
at java.base/java.lang.Integer.parseInt(Integer.java:770)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.handleEvent(StreamingSheetReader.java:118)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.getRow(StreamingSheetReader.java:71)
at com.monitorjbl.xlsx.impl.StreamingSheetReader.access$200(StreamingSheetReader.java:32)
at com.monitorjbl.xlsx.impl.StreamingSheetReader$StreamingRowIterator.hasNext(StreamingSheetReader.java:402)
at com.dataiku.dip.formats.excel.ExcelFormatExtractor.doExtractStream(ExcelFormatExtractor.java:148)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.extractSimple(ArchiveCapableFormatExtractor.java:154)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:59)
at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:177)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:234)
at com.dataiku.dip.dataflow.exec.stream.SingleThreadFSLikeDatasetRunnable.run(SingleThreadFSLikeDatasetRunnable.java:71)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[15:20:26] [INFO] [dku.flow.activity] running compute_OP01_New_prepared_NP - Executing default post-activity lifecycle hook
[15:20:26] [INFO] [dku.flow.activity] running compute_OP01_New_prepared_NP - Removing samples for POLICYDATA.OP01_New_prepared
[15:20:26] [INFO] [dku.flow.activity] running compute_OP01_New_prepared_NP - Done post-activity tasks


Operating system used: Ubuntu


nuvitu9999
Level 3
Author

Data quality is not the problem. I split the file into two parts, and each part loaded without errors.
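
For anyone reading the log: the only thing it states for certain is that `Integer.parseInt` rejected the string "1e6" inside `StreamingSheetReader.handleEvent`. Below is a minimal sketch that reproduces just that behavior outside Dataiku. The class `ParseDemo` and the helper `lenientParse` are hypothetical names used for illustration; they are not part of Dataiku or of the xlsx streaming library. Whether "1e6" comes from a row index or from some other attribute in the workbook XML is an assumption I have not verified.

```java
import java.math.BigDecimal;

public class ParseDemo {

    // Runs the same strict call that appears in the stack trace and
    // reports whether it throws NumberFormatException.
    static boolean strictParseFails(String s) {
        try {
            Integer.parseInt(s);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Hypothetical lenient alternative: BigDecimal accepts exponent
    // notation, and intValueExact() throws ArithmeticException if the
    // value has a fractional part or does not fit in an int.
    static int lenientParse(String s) {
        return new BigDecimal(s.trim()).intValueExact();
    }

    public static void main(String[] args) {
        // Integer.parseInt only accepts an optional sign followed by digits.
        System.out.println("strict parse of 1e6 fails: " + strictParseFails("1e6"));
        System.out.println("lenient parse of 1e6: " + lenientParse("1e6"));
        System.out.println("lenient parse of 999999: " + lenientParse("999999"));
    }
}
```

If the value really is a row index written in scientific notation by whatever tool produced the file, that would be consistent with the two halves loading fine, since neither half would reach row 1,000,000. That is a guess based on the log, not something I have confirmed by opening the file.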