During a fun debug session at work, Drew and I tracked down the cause of a misbehaving web app. The problem related to how Apache Tomcat decides to recompile JSPs and how some of our source code was branched and modified. Beware of modified JSPs in branches when you move to newer branches.
The problem appeared after a new version was deployed in production. The symptom was that a chunk of HTML on a page in a web app displayed on the QA servers, but not on the production servers. We confirmed it was the same WAR file, the same 6.0.16 version of Tomcat and the same 1.5.14 version of the JDK. There were no errors in the app log or in catalina.out. Having recently seen errors from other apps in the localhost log file, I decided to look there. We found a message logged at SEVERE regarding a JspException from not finding a value on an object using operator “.”. This indicated that a JSP was failing because it referred to a property that didn’t exist on a Java class.
Next, we hunted down the Java class that the Jasper compiler generated from the JSP and compared it between production and QA. We found that the Java code on production had an extra method that related to this missing property. So, even though we deployed the same WAR file, the Java code for the JSP on the file system was different. By default, the code generated from JSPs ends up in tomcat/work/Catalina/localhost/{context}/org/apache/jsp/WEB_002dINF/jsp. Reverse engineering it to correlate the Java with the JSP code wasn’t as bad as I expected.
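If you need to find the generated source for a particular JSP, it helps to know how the names are mangled. The following is a rough sketch of the mangling visible in the path above, not Jasper's actual code: letters and digits are kept, a dot becomes an underscore, and any other character becomes an underscore followed by its four-digit hex code. The class name JasperNames and the file name index.jsp are made up for illustration.

```java
public class JasperNames {
    // Approximates how one path segment is turned into a Java-safe name.
    // Assumption: a simplified rule that ignores edge cases such as
    // leading digits and Java keywords.
    static String mangle(String segment) {
        StringBuilder out = new StringBuilder();
        for (char ch : segment.toCharArray()) {
            if (Character.isLetterOrDigit(ch)) {
                out.append(ch);                               // kept as-is
            } else if (ch == '.') {
                out.append('_');                              // index.jsp -> index_jsp
            } else {
                out.append(String.format("_%04x", (int) ch)); // '-' -> _002d
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("WEB-INF"));             // WEB_002dINF
        System.out.println(mangle("index.jsp") + ".java"); // index_jsp.java
    }
}
```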
Next, we looked at the dependencies of the previous version and found that the class in question contained the field mentioned in the error message. Drew had switched to a newer branch for the new build, so the new file wasn’t strictly newer. And there’s the rub.
The timestamp of the class file in the older branch was newer than the timestamp of the class file in the newer branch, because it had been modified after it was branched.
When the app was deployed on the QA servers, the previous version had been undeployed. This caused the context directory for this app in the work directory to be deleted. When the new version was deployed, Jasper generated and compiled Java from the JSPs as they were accessed.
However, on production, the new version was deployed as a replacement for the previous version. So, the context directory wasn’t deleted. When the JSP was accessed, Jasper saw that the timestamp for the previous version was newer, so it didn’t generate new Java code for the JSP file. However, the Java class the JSP page depended on had changed and the code no longer worked.
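To make the difference between QA and production concrete, here is a minimal sketch of a timestamp-based staleness check of the kind described above. It is a simplification, not Jasper's actual code, and the class name, method name and timestamp values are all made up. A missing generated file is represented by 0, which is what java.io.File.lastModified() returns for a file that doesn't exist.

```java
public class StaleCheck {
    // Regenerate only when the generated file is older than the JSP.
    // A generated file that doesn't exist has timestamp 0, so it always loses.
    static boolean needsRegeneration(long jspLastModified, long generatedLastModified) {
        return generatedLastModified < jspLastModified;
    }

    public static void main(String[] args) {
        long jspInNewBranch = 1000L;         // hypothetical: unchanged since branching
        long generatedFromOldBranch = 2000L; // hypothetical: old branch's JSP was modified later

        // Production: the work directory survived, so the stale code is kept.
        System.out.println(needsRegeneration(jspInNewBranch, generatedFromOldBranch)); // false

        // QA: the work directory was deleted, so there is nothing to compare against.
        System.out.println(needsRegeneration(jspInNewBranch, 0L)); // true
    }
}
```

The check only asks whether the generated file is newer than the source. It has no way to tell that the generated file came from a different version of the JSP.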
The quick fix was to undeploy the web app to let Tomcat clean up the work directory, and then to deploy it again.
However, it turned out that the change in the older branch was important. So, be sure to also diff the JSPs before moving to a new branch.