4

I'm getting a java.lang.OutOfMemoryError: Java heap space.

I'm parsing an XML file, storing the data, and writing out a new XML file once parsing is complete.

I'm a bit surprised to get this error, because the original XML file is not long at all.

Code: http://d.pr/RSzp File: http://d.pr/PjrE

asked Mar 24, 2011 at 17:00
3
  • It is likely you have a bug in your program. Try taking a heap dump on the OOM error with a modest maximum memory size (so the dump is not too large). Commented Mar 24, 2011 at 17:17
  • [I am still waiting for an answer that gives suggestions for live memory analysis/monitoring. A (simple) tool can go a long way. There are some SO questions that cover just this.] Commented Mar 24, 2011 at 17:40
  • See also (search "java profile memory"): stackoverflow.com/questions/5349062/… stackoverflow.com/questions/756873/… stackoverflow.com/questions/46642/… Commented Mar 24, 2011 at 17:46

7 Answers

2

Short answer on why you get the OutOfMemoryError: for every centroid found in the file, you loop over the already "registered" centroids to check whether it is already known (to either add a new one or update the registered one). But for every failed comparison you add a new copy of the new centroid. So each new centroid gets added as many times as there are centroids already in the list; then the loop reaches the first copy you added, updates it, and exits. The list therefore roughly doubles with every new centroid.
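To make the growth concrete, here is a minimal sketch of the pattern described above. It is my reconstruction, not the asker's actual code: the class name DuplicateGrowthDemo, the method buggyRegister, and the use of plain strings instead of centroids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicateGrowthDemo {
    // Hypothetical reconstruction of the buggy pattern: the add sits inside the loop,
    // so it runs once per failed comparison instead of once per new event.
    static void buggyRegister(List<String> events, String event) {
        if (events.isEmpty()) {
            events.add(event);
            return;
        }
        for (int i = 0; i < events.size(); i++) {
            if (events.get(i).equals(event)) {
                return; // found (possibly a copy we just added): "update" and leave the loop
            }
            events.add(event); // bug: one extra copy per non-matching entry
        }
    }

    // Registers the given number of distinct events and returns the resulting list size.
    static int sizeAfter(int distinctEvents) {
        List<String> events = new ArrayList<>();
        for (int i = 0; i < distinctEvents; i++) {
            buggyRegister(events, "event-" + i);
        }
        return events.size();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++) {
            System.out.println(n + " distinct events -> " + sizeAfter(n) + " list entries");
        }
    }
}
```

In this sketch, 10 distinct events already produce 512 list entries, so a file with a few dozen distinct events is enough to exhaust any heap size, no matter how small the file is.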

Here is some refactored code:

public class CentroidGenerator {
 final Map<String, Centroid> centroids = new HashMap<String, Centroid>();
 public Collection<Centroid> getCentroids() {
 return centroids.values();
 }
 public void nextItem(FlickrDoc flickrDoc) {
 final String event = flickrDoc.getEvent();
 final Centroid existingCentroid = centroids.get(event);
 if (existingCentroid != null) {
 existingCentroid.update(flickrDoc);
 } else {
 final Centroid newCentroid = new Centroid(flickrDoc);
 centroids.put(event, newCentroid);
 }
 }
 public static void main(String[] args) throws IOException, SAXException {
 // instantiate Digester and disable XML validation
 [...]
 // now that rules and actions are configured, start the parsing process
 CentroidGenerator abp = (CentroidGenerator) digester.parse(new File("PjrE.data.xml"));
 Writer writer = null;
 try {
 File fileOutput = new File("centroids.xml");
 writer = new BufferedWriter(new FileWriter(fileOutput));
 writeOutput(writer, abp.getCentroids());
 } catch (FileNotFoundException e) {
 e.printStackTrace();
 } catch (IOException e) {
 e.printStackTrace();
 } finally {
 try {
 if (writer != null) {
 writer.close();
 }
 } catch (IOException e) {
 e.printStackTrace();
 }
 }
 }
 private static void writeOutput(Writer writer, Collection<Centroid> centroids) throws IOException {
 writer.append("<?xml version='1.0' encoding='utf-8'?>" + System.getProperty("line.separator"));
 writer.append("<collection>").append(System.getProperty("line.separator"));
 for (Centroid centroid : centroids) {
 writer.append("<doc>" + System.getProperty("line.separator"));
 writer.append("<title>" + System.getProperty("line.separator"));
 writer.append(centroid.getTitle());
 writer.append("</title>" + System.getProperty("line.separator"));
 writer.append("<description>" + System.getProperty("line.separator"));
 writer.append(centroid.getDescription());
 writer.append("</description>" + System.getProperty("line.separator"));
 writer.append("<time>" + System.getProperty("line.separator"));
 writer.append(centroid.getTime());
 writer.append("</time>" + System.getProperty("line.separator"));
 writer.append("<tags>" + System.getProperty("line.separator"));
 writer.append(centroid.getTags());
 writer.append("</tags>" + System.getProperty("line.separator"));
 writer.append("<geo>" + System.getProperty("line.separator"));
 writer.append("<lat>" + System.getProperty("line.separator"));
 writer.append(centroid.getLat());
 writer.append("</lat>" + System.getProperty("line.separator"));
 writer.append("<lng>" + System.getProperty("line.separator"));
 writer.append(centroid.getLng());
 writer.append("</lng>" + System.getProperty("line.separator"));
 writer.append("</geo>" + System.getProperty("line.separator"));
 writer.append("</doc>" + System.getProperty("line.separator"));
 }
 writer.append("</collection>" + System.getProperty("line.separator") + System.getProperty("line.separator"));
 }
 /**
 * JavaBean class that holds properties of each Document entry. It is important that this class be public and
 * static, in order for Digester to be able to instantiate it.
 */
 public static class FlickrDoc {
 private String id;
 private String title;
 private String description;
 private String time;
 private String tags;
 private String latitude;
 private String longitude;
 private String event;
 public void setId(String newId) {
 id = newId;
 }
 public String getId() {
 return id;
 }
 public void setTitle(String newTitle) {
 title = newTitle;
 }
 public String getTitle() {
 return title;
 }
 public void setDescription(String newDescription) {
 description = newDescription;
 }
 public String getDescription() {
 return description;
 }
 public void setTime(String newTime) {
 time = newTime;
 }
 public String getTime() {
 return time;
 }
 public void setTags(String newTags) {
 tags = newTags;
 }
 public String getTags() {
 return tags;
 }
 public void setLatitude(String newLatitude) {
 latitude = newLatitude;
 }
 public String getLatitude() {
 return latitude;
 }
 public void setLongitude(String newLongitude) {
 longitude = newLongitude;
 }
 public String getLongitude() {
 return longitude;
 }
 public void setEvent(String newEvent) {
 event = newEvent;
 }
 public String getEvent() {
 return event;
 }
 }
 public static class Centroid {
 private final String event;
 private String title;
 private String description;
 private String tags;
 private Integer time;
 private int nbTimeValues = 0; // needed to calculate the average later
 private Float latitude;
 private int nbLatitudeValues = 0; // needed to calculate the average later
 private Float longitude;
 private int nbLongitudeValues = 0; // needed to calculate the average later
 public Centroid(FlickrDoc flickrDoc) {
 event = flickrDoc.event;
 title = flickrDoc.title;
 description = flickrDoc.description;
 tags = flickrDoc.tags;
 if (flickrDoc.time != null) {
 time = Integer.valueOf(flickrDoc.time.trim());
 nbTimeValues = 1; // time is the sum of one value
 } 
 if (flickrDoc.latitude != null) {
 latitude = Float.valueOf(flickrDoc.latitude.trim());
 nbLatitudeValues = 1; // latitude is the sum of one value
 }
 if (flickrDoc.longitude != null) {
 longitude = Float.valueOf(flickrDoc.longitude.trim());
 nbLongitudeValues = 1; // longitude is the sum of one value
 }
 }
 public void update(FlickrDoc newData) {
 title = title + " " + newData.title;
 description = description + " " + newData.description;
 tags = tags + " " + newData.tags;
 if (newData.time != null) {
 nbTimeValues++;
 if (time == null) {
 time = 0;
 }
 time += Integer.valueOf(newData.time.trim());
 }
 if (newData.latitude != null) {
 nbLatitudeValues++;
 if (latitude == null) {
 latitude = 0F;
 }
 latitude += Float.valueOf(newData.latitude.trim());
 }
 if (newData.longitude != null) {
 nbLongitudeValues++;
 if (longitude == null) {
 longitude = 0F;
 }
 longitude += Float.valueOf(newData.longitude.trim());
 }
 }
 public String getTitle() {
 return title;
 }
 public String getDescription() {
 return description;
 }
 public String getTime() {
 if (nbTimeValues == 0) {
 return null;
 } else {
 return Integer.toString(time / nbTimeValues);
 }
 }
 public String getTags() {
 return tags;
 }
 public String getLat() {
 if (nbLatitudeValues == 0) {
 return null;
 } else {
 return Float.toString(latitude / nbLatitudeValues);
 }
 }
 public String getLng() {
 if (nbLongitudeValues == 0) {
 return null;
 } else {
 return Float.toString(longitude / nbLongitudeValues);
 }
 }
 public String getEvent() {
 return event;
 }
 }
}
answered Mar 25, 2011 at 12:50

1 Comment

BTW, looking at your code, your averaging is also wrong: if you have 4, 4, 4, 4, 4, 4, 4, 4, 2, you will return 3 as the average...
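A small sketch of what this comment describes. I am assuming the original code averaged each new value with the running result, which is what the numbers in the comment suggest; the original code is only linked, so treat pairwiseAverage as a guess at the pattern. AverageDemo and both method names are made up for this example.

```java
class AverageDemo {
    // Assumed buggy pattern: average the running result with each new value.
    // Later values get far more weight than earlier ones.
    static double pairwiseAverage(double[] values) {
        double avg = values[0];
        for (int i = 1; i < values.length; i++) {
            avg = (avg + values[i]) / 2;
        }
        return avg;
    }

    // Arithmetic mean: keep the sum and the count, divide once at the end.
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        double[] values = {4, 4, 4, 4, 4, 4, 4, 4, 2};
        System.out.println("pairwise: " + pairwiseAverage(values));
        System.out.println("mean:     " + mean(values));
    }
}
```

For the values in the comment, the pairwise version gives 3.0, while the sum-and-count version gives 34/9, about 3.78. Keeping a sum and a counter, as the refactored Centroid class does, avoids the problem.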
2

You could try setting the -Xms and -Xmx values higher in your eclipse.ini file (I'm assuming you're using Eclipse).

Example:

-vmargs

-Xms128m //(initial heap size)

-Xmx256m //(max heap size)
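If you are unsure whether a heap setting actually reached your program, a quick check is to print what the JVM reports. This is only a sketch; HeapInfo and maxHeapMiB are names I made up, while Runtime.maxMemory() is the standard API call.

```java
class HeapInfo {
    // Maximum amount of memory the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Run it with the same options as your real program, for example java -Xmx256m HeapInfo. The reported number may differ slightly from the configured value depending on the JVM and garbage collector, but it should be in the same range.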

answered Mar 24, 2011 at 17:05

4 Comments

No, I'm running from the terminal: java -vmargs -Xms128m -Xmx256m -cp .:jars/* CentroidGenerator data/data.xml --- Unrecognized option: -vmargs Could not create the Java virtual machine.
By the way, I've already tried java -Xms128m -Xmx1024m -cp .:jars/* CentroidGenerator data/data.xml and I get the same error.
Setting the values in eclipse.ini only affects Eclipse itself, not the Java programs that it launches. Setting -Xmx is the right direction, but it is done in the program's launch configuration inside Eclipse.
@Yoni I'm not using Eclipse, I'm just running my app from the terminal with the suggested parameters, but it doesn't work; I still get the out-of-memory error.
2

If this is a one-off thing that you just want to get done, I'd try Jason's advice of increasing the memory available to Java.

You are building a very large list of objects and then looping through that list to build a String, then writing that String to a file. The list and the String are probably the reasons for your high memory usage. You could reorganise your code in a more stream-oriented way: open your output file at the start, then write the XML for each Centroid as it is parsed. Then you wouldn't need to keep a big list of them, and you wouldn't need to hold a big String representing all the XML.
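A rough sketch of that stream-oriented shape, using the JDK's XMLStreamWriter so that escaping is handled for you. The class StreamingCentroidWriter and its methods are hypothetical, and only two fields are written to keep it short. It assumes each record is final at the moment it is written, which may not hold if centroids are updated later.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StreamingCentroidWriter {
    private final XMLStreamWriter xml;

    // Opens the document and the root element right away.
    StreamingCentroidWriter(Writer out) throws XMLStreamException {
        xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("collection");
    }

    // Call this from the parser callback for each finished record.
    void writeDoc(String title, String description) throws XMLStreamException {
        xml.writeStartElement("doc");
        writeElement("title", title);
        writeElement("description", description);
        xml.writeEndElement();
    }

    private void writeElement(String name, String text) throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text); // escapes &, < and > in the text
        xml.writeEndElement();
    }

    // Closes the root element and flushes everything to the underlying writer.
    void finish() throws XMLStreamException {
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.flush();
        xml.close();
    }

    // Small in-memory demo; a real program would pass a FileWriter instead.
    static String demo() {
        try {
            StringWriter out = new StringWriter();
            StreamingCentroidWriter writer = new StreamingCentroidWriter(out);
            writer.writeDoc("a & b", "first");
            writer.writeDoc("c", "second");
            writer.finish();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Nothing is accumulated between calls to writeDoc, so memory use stays flat regardless of how many records pass through.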

answered Mar 24, 2011 at 17:10

1 Comment

yeah I got your answer, the only issue is that the centroids are actually updated progressively when new related items are discovered in the source file. In other terms, the first centroid might need to be updated successively and for this reason I cannot just print and release it.
2

Dump the heap and analyze it. You can configure an automatic heap dump on out-of-memory errors with the -XX:+HeapDumpOnOutOfMemoryError JVM option.

http://www.oracle.com/technetwork/java/javase/index-137495.html

https://www.infoq.com/news/2015/12/OpenJDK-9-removal-of-HPROF-jhat

(deleted) http://blogs.oracle.com/alanb/entry/heap_dumps_are_back_with (end of deletion)
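If you want to confirm from inside the program that the option is switched on, something like the following should work on HotSpot-based JVMs. It relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean, so treat it as a sketch that assumes such a JVM; HeapDumpOptionCheck is a made-up class name.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class HeapDumpOptionCheck {
    // Returns "true" or "false" depending on whether the JVM was started
    // with -XX:+HeapDumpOnOutOfMemoryError.
    static String heapDumpOnOom() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = " + heapDumpOnOom());
    }
}
```

Run it once with and once without the option to see the value change.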

answered Mar 24, 2011 at 17:12

1 Comment

Well, that blog is gone, and I can't find a cached version of the article anymore. I'm linking another article that gives the basic commands for current Java and also explains the changes in Java 9.
0

Answering the question "How to Debug"

It starts with gathering the information that's missing from your post. Information that could potentially help future people having the same problem.

First, the complete stack trace. An out-of-memory exception that's thrown from within the XML parser is very different from one thrown from your code.

Second, the size of the XML file, because "not long at all" is completely useless. Is it 1K, 1M, or 1G? How many elements?

Third, how are you parsing? SAX, DOM, StAX, something completely different?

Fourth, how are you using the data? Are you processing one file or multiple files? Are you accidentally holding onto data after parsing? A code sample would help here (and a link to some 3rd-party site isn't terribly useful for future SO users).

answered Mar 24, 2011 at 17:29

1 Comment

1. OK. 2. I've posted the file; it is 328KB. 3-4. I've posted the code; Commons Digester is the parser.
0

OK, I'll admit I'm avoiding your direct question with a possible alternative. You might want to consider parsing with XStream instead and let it do the bulk of the work with less code. My rough example below parses your XML with a 64MB heap. Note that it also requires Apache Commons IO, just to read the input easily so that the hack of turning <collection> into <list> can be applied.

import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.commons.io.FileUtils;
import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.annotations.XStreamAlias;
public class CentroidGenerator {
 public static void main(String[] args) throws IOException {
 for (Centroid centroid : getCentroids(new File("PjrE.data.xml"))) {
 System.out.println(centroid.title + " - " + centroid.description);
 }
 }
 @SuppressWarnings("unchecked")
 public static List<Centroid> getCentroids(File file) throws IOException {
 String input = FileUtils.readFileToString(file, "UTF-8");
 input = input.replaceAll("collection>", "list>");
 XStream xstream = new XStream();
 xstream.processAnnotations(Centroid.class);
 Object output = xstream.fromXML(input);
 return (List<Centroid>) output;
 }
 @XStreamAlias("doc")
 @SuppressWarnings("unused")
 public static class Centroid {
 private String id;
 private String title;
 private String description;
 private String time;
 private String tags;
 private String latitude;
 private String longitude;
 private String event;
 private String geo;
 }
}
answered Mar 24, 2011 at 17:31

2 Comments

Hmm, OK, but what's the problem with commons-digester? Isn't it just a parser like the other ones? Thanks.
@Patrick: that's a good question that I don't have an answer to. Either it's inefficient or your code is doing something that requires a lot of memory. Skimming over the code, I don't see anything obvious, though; it seems really odd that it would need more than 1GB. You might want to try a profiler like YourKit to see what's consuming so much memory. Or just get a memory histogram with jmap -histo.
0

I downloaded your code, something that I almost never do. And I can say with 99% certainty that the bug is in your code: an incorrect "if" inside a loop. It has nothing whatsoever to do with Digester or XML. Either you've made a logic error or you didn't fully think through just how many objects you'd create.

But guess what: I'm not going to tell you what your bug is.

If you can't figure it out from the few hints that I've given above, too bad. It's the same situation that you put all of the other respondents through by not providing enough information -- in the original post -- to actually start debugging.

Perhaps you should read -- actually read -- my former post, and update your question with the information it requests. Or, if you can't be bothered to do that, accept your F.

answered Mar 25, 2011 at 12:39
