I am reading a single XML file of size 2.6GB, and the JVM heap is 6GB.
However, I am still getting a heap space out-of-memory error.
What am I doing wrong here?
For reference, I printed the max memory and free memory properties of the JVM.
Max memory was shown as approximately 5.6GB, but free memory was shown as only 90MB. Why is only 90MB shown as free, especially when I have not even started any processing? I have only just started the program.
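The small "free" figure is explained by how the JVM reports memory: freeMemory() measures free space within the *currently committed* heap (totalMemory()), not within the -Xmx ceiling, and at startup the committed heap is still small. A minimal sketch of how these figures were likely obtained:

```java
// Sketch: querying JVM memory figures. freeMemory() is relative to the
// currently committed heap (totalMemory()), not the -Xmx maximum, so it
// looks tiny at startup before the heap has grown toward maxMemory().
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();        // -Xmx ceiling
        long total = rt.totalMemory();    // currently committed heap
        long free = rt.freeMemory();      // free within the committed heap
        long available = max - (total - free); // what can actually still be allocated
        System.out.printf("max=%dMB total=%dMB free=%dMB available=%dMB%n",
                max >> 20, total >> 20, free >> 20, available >> 20);
    }
}
```

So a 90MB "free" reading right after startup does not mean only 90MB is usable; the heap will grow on demand up to the -Xmx limit.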
-
What operating system are you using? Some have limits on how much memory one process can consume; I believe 32-bit Windows is max 2GB. – Michael J. Lee, Dec 28, 2012 at 16:54
-
2.6GB XML - OMG! Use a database! Storing an XML file in memory will use much more space than the flat file on disk, because of all the node objects, child lists, attribute objects, etc. – jlordo, Dec 28, 2012 at 16:54
-
@jlordo - reading an XML file with SAX or into a DOM can be a perfectly appropriate thing to do. Depending on the requirements, a database could actually be the worst possible solution. IMHO... – paulsm4, Dec 28, 2012 at 17:08
-
@paulsm4 I agree, it can be perfectly appropriate. But a 2.6GB XML file will never be suitable for DOM representation, and is only good for SAX in a few cases. At this amount of data a database is a good choice because of memory consumption, query and manipulation opportunities, and access speed. – jlordo, Dec 28, 2012 at 17:13
4 Answers
In general, when converting structured text to the corresponding data structures in Java you need a lot more space than the size of the input file. There is a lot of overhead associated with the various data structures that are used, apart from the space required for the strings.
For example, each String instance has an additional overhead of about 32-40 bytes - not to mention that each character is stored in two bytes, which effectively doubles the space requirements for ASCII-encoded XML.
Then you have additional overhead when storing the String in a structure. For example, in order to store a String instance in a Map, you will need about 16-32 bytes of additional overhead, depending on the implementation and how you measure the usage.
It is quite possible that 6GB is just not enough to store a parsed 2.6GB XML file at once...
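A back-of-envelope calculation makes this concrete. The figures below are assumptions taken from the overheads described above, and the string count is purely hypothetical; real numbers vary by JVM version and document shape:

```java
// Rough estimate of in-memory cost for a 2.6GB XML file, using the
// per-object overheads mentioned above. All constants are assumptions,
// not measured values.
public class DomCostEstimate {
    public static void main(String[] args) {
        long fileBytes = 2_600_000_000L;      // 2.6 GB XML on disk
        long charData = fileBytes * 2;        // UTF-16 doubles ASCII text
        long perStringOverhead = 40;          // approx. String header + fields
        long perMapEntryOverhead = 32;        // approx. cost of storing it in a Map
        long stringCount = 50_000_000L;       // hypothetical number of strings
        long total = charData
                + stringCount * (perStringOverhead + perMapEntryOverhead);
        System.out.printf("~%.1f GB before counting any DOM node objects%n",
                total / 1e9);
    }
}
```

Even with these conservative assumptions, the character data alone exceeds 5GB before a single node object is allocated, which is why the 6GB heap is not enough.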
Bottom line:
If you are loading such a large XML file in memory (e.g. using a DOM parser) you are probably doing something wrong. A stream-based parser such as SAX should have far more modest requirements.
Alternatively consider transforming the XML file into a more usable file format, such as an embedded database - or even an actual server-based database. That would allow you to process far larger documents without issues.
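A minimal SAX sketch of the stream-based approach recommended above (the filename and the "count elements" processing are placeholders; real code would do its work inside the callbacks and discard the data afterwards):

```java
// SAX streams the document event by event, so memory use stays bounded
// no matter how large the file is. "huge.xml" is a placeholder path.
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxScan {
    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        long[] count = {0};
        parser.parse(new File("huge.xml"), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;   // handle the element here, retain nothing
            }
        });
        System.out.println("elements: " + count[0]);
    }
}
```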
You should avoid loading the entire xml into memory at once and instead use a specialized class that can deal with large amounts of xml.
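One such specialized class is the streaming StAX reader from javax.xml.stream, sketched here (the filename is a placeholder; the loop simply counts start elements to show the pull-based pattern):

```java
// StAX pulls parsing events one at a time instead of building a tree,
// so the whole document never needs to be in memory at once.
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxScan {
    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (FileInputStream in = new FileInputStream("huge.xml")) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            long elements = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    elements++;   // process the element, keep nothing
                }
            }
            reader.close();
            System.out.println("elements: " + elements);
        }
    }
}
```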
There are potentially several different issues here.
But for starters:
1) If you're on a 64-bit OS, make sure you're using a 64-bit JVM
2) Make sure your code closes all resources you open as promptly as possible.
3) Explicitly set references to large objects you're done with to "null".
... AND ...
If they can set -Xmx6144m, then they are actually using a 64-bit OS and JVM. You can't load a 2.6 GB XML file as a document with just 6 GB. As jlordo suggests, the ratio is more likely to be 12 to 1, because every byte turns into a 16-bit character and every tag, attribute and value turns into a String with at least 32 bytes of overhead.
Instead, what you should do is use a SAX or event-based parser to process the file progressively. That way it will only keep as much data as you need to retain. If you can process everything in one pass, you won't need to retain anything.