Unable to read a huge XML or text file
조회 수: 6 (최근 30일)
이전 댓글 표시
Hi,
I have a XML of 2GB in size. I keep getting java heap memory error when loading it. So I am thinking of reading it in as a text file and remove many useless rows in that file before saving it into a new and smaller file.
How to do that? I cannot even read it with textpad. Thanks!
답변 (1개)
Santa Raghavan
2017년 7월 26일
편집: Santa Raghavan
2017년 7월 26일
The amount of Java Heap memory available to MATLAB can be increased and this can be done in the following way:
In the MATLAB Desktop Window:
For versions of MATLAB R2010a and above, use - File -> Preferences -> General -> Java Heap Memory. Move the slider to adjust the allocated heap memory.
For versions of MATLAB prior to R2010a, refer to the link below-
If that does not work, you can read it in as a text file using the textscan function by specifying the block size you wish to read at a time.
fileID = fopen('bigfile.txt');
formatSpec = '%s %f %*f %*f %s';
Read a block of data in the file. Use the HeaderLines name-value pair argument to instruct textscan to skip two lines before reading data.
D = textscan(fileID,formatSpec,'HeaderLines',2,'Delimiter','\t')
댓글 수: 2
Santa Raghavan
2017년 7월 27일
You can also try the datastore function that lets you read files that dont fit into the memory.
ds = datastore('Myfile.xml', ...
'TreatAsMissing','NA')
ds.ReadSize = 100; % Specifies the number of lines
% you want to read at a time.
read(ds) % Reads first 100 lines in file
read(ds) % Reads next 100 lines in file
Subsequent read calls on ds fetches data from last read point.
참고 항목
카테고리
Help Center 및 File Exchange에서 Data Import and Export에 대해 자세히 알아보기
제품
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!