
Big data analysis using MATLAB and a database connection: is it possible?

Hi,
I have code that analyzes a large amount of data in a database, as follows:
for i = 1:N   % N is large
    A = exec(con, query);   % retrieves ONLY one row of data from the database
    A = fetch(A);
    A = A.data;
    % ... other calculations
end
It always stalls [GC/memory error] after a few hours. I have already tried both of these methods to get around it:
  1. java.opts solution
  2. JheapCl solution [garbage collector]
But I still have no luck. I have already increased the Java heap memory to 8 GB, and the machine has plenty of memory (32 GB). Furthermore, I retrieve only one row per iteration and reuse the same variable, so I don't think the cause is insufficient memory.
Can anybody help me with this issue? Any help will be greatly appreciated. Thanks.
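One way to check whether the Java heap really is filling up is to log its usage from inside the loop. The following is a minimal sketch that uses the standard java.lang.Runtime methods; the iteration count, the reporting interval, and the variable names are placeholders, not part of the original code.

```matlab
% Sketch only: N and reportEvery are placeholder values.
rt   = java.lang.Runtime.getRuntime();   % JVM that MATLAB is running on
toMB = @(bytes) double(bytes) / 2^20;    % bytes -> megabytes

N           = 1000;   % placeholder iteration count
reportEvery = 100;    % placeholder reporting interval
for i = 1:N
    % ... query and calculations go here ...
    if mod(i, reportEvery) == 0
        usedMB = toMB(rt.totalMemory() - rt.freeMemory());
        maxMB  = toMB(rt.maxMemory());
        fprintf('iter %d: Java heap used %.1f of %.1f MB\n', i, usedMB, maxMB);
    end
end
```

If the used figure climbs steadily toward the maximum, something is probably holding on to Java objects; if it stays flat, the stall likely has another cause.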

Accepted Answer

Yair Altman, 15 Oct 2013 (edited 15 Oct 2013)
1. Try disconnecting from the DB every now and then, explicitly clearing all references so that no dangling references remain that cannot be garbage-collected. After disconnecting and before reconnecting, run a manual GC (JheapCl makes this easy, or call the simpler java.lang.System.gc).
2. Try fetching data in bulk rather than one row at a time. This should improve performance and reduce resource usage.
3. Ensure that you're connecting to the DB directly via JDBC rather than through an ODBC bridge.
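The three suggestions above could be combined roughly as follows. This is only a sketch under stated assumptions: openConnection, the connection details, the table and column names, and the LIMIT/OFFSET paging syntax (which depends on the SQL dialect) are all hypothetical. It uses the same exec/fetch cursor interface as the question. Closing each cursor after use is an addition made here, in the spirit of clearing all references.

```matlab
% Sketch only: the connection details, my_table, id and the LIMIT/OFFSET
% syntax are assumptions, not taken from the original post.
openConnection = @() database('mydb', 'usr', 'passwd', ...
    'com.mysql.jdbc.Driver', 'jdbc:mysql://localhost:3306/mydb');  % direct JDBC

N              = 1e7;    % total number of rows to process
blockSize      = 1000;   % rows fetched per query (bulk fetch)
reconnectEvery = 10;     % number of blocks between reconnects

con     = openConnection();
nBlocks = ceil(N / blockSize);
for k = 1:nBlocks
    offset = (k - 1) * blockSize;
    query  = sprintf('SELECT * FROM my_table ORDER BY id LIMIT %d OFFSET %d', ...
                     blockSize, offset);

    curs = exec(con, query);
    curs = fetch(curs);
    A    = curs.Data;    % one block of rows
    close(curs);         % close the cursor once its data has been copied
    clear curs           % drop the MATLAB reference as well

    % ... process A here ...

    if mod(k, reconnectEvery) == 0
        close(con);
        clear con
        java.lang.System.gc();   % manual GC between disconnect and reconnect
        con = openConnection();
    end
end
close(con);
```

Because only the connection and cursor are cleared, the loop state stays in the workspace and nothing has to be saved to disk and reloaded.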
  Comments: 1
Taufik Sutanto, 16 Oct 2013
Thank you for the suggestions. I connect directly using a JDBC connection. I tried retrieving 1000 rows per query and closing the connection after every 10 queries. Between closing and reopening the connection, I tried to clear the heap and even the workspace. I tried the following:
NN = ceil(N/blok_data);
for i = 1:NN
    a_function(con1, x, y, z, start, blok_data);
    % the function retrieves data from the database and processes it
    start = start + blok_data;
    if mod(i,10) == 0
        close(con1); clear con1; warning off;
        save('c:\tmp\RBCC_Data.mat');
        clear all; clear java; jheapcl(); warning on;
        load('c:\tmp\RBCC_Data.mat');
        con1 = connection_function(usr, db, passwd);
    end
end
It helps delay the error, but MATLAB still stops responding after about 5 million calls. [I am hoping it can reach at least 10 million calls.]
By the way, do you know why, even after I close the connection and clear the connection variable, I get the warning message "the Jdbc not serializable" when trying to reconnect? [Hence the "warning off" line in the code above.]
