20180325

Coursier resolution failing with HTTP response code 416 in sbt

Or, how I made my first contribution to coursier
[trace] Stack trace suppressed: run last service/*:update for the full output.
[error] (service/*:update) coursier.ResolutionException: 1 download error
[error]     Caught java.io.IOException: 
    Server returned HTTP response code: 416 for URL: 
    https://repo1.maven.org/maven2/org/graylog2/gelfj/1.1.14/gelfj-1.1.14.jar 
    (Server returned HTTP response code: 416 for URL: 
    https://repo1.maven.org/maven2/org/graylog2/gelfj/1.1.14/gelfj-1.1.14.jar) 
    while downloading 
    https://repo1.maven.org/maven2/org/graylog2/gelfj/1.1.14/gelfj-1.1.14.jar
I ran into this problem with sbt dependency resolution around 7 weeks ago. I was in a hurry, so I commented out the offending import (it was not in the subproject I was working on, so it was not needed for the run I was doing), sent my commit to the heavens, and CircleCI was happy.

I was not happy, though. For the next few weeks (sounds like a lot, but it was more like 2 commits), every time I had to work on this project I commented out the import to get it to compile/run/test, until I was fed up enough to check what the problem was.

What is response code 416

HTTP 416 is Range Not Satisfiable. It means the server cannot serve the byte range we asked for, typically because the file is shorter than the range we requested (or the range is otherwise unusable). The expectation should be that after a 416 the client issues a full request instead, but this was not the case with Coursier (I created an issue and submitted a PR to fix it; it should be fixed in the next release).
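
A minimal sketch of that expectation, written in Python for brevity and not taken from Coursier's code: the function name next_step and the labels it returns are made up for illustration. Given the status code that comes back from a resume request (one sent with a Range header), it picks what a downloader could reasonably do with the local partial file.

```python
def next_step(status: int) -> str:
    """Decide what to do after a resume request (one sent with a Range header).

    The returned labels are hypothetical names used only in this sketch.
    """
    if status == 206:
        # Partial Content: the server honoured the range, keep appending.
        return "append"
    if status == 200:
        # The server ignored the range and is sending the whole file again.
        return "overwrite"
    if status == 416:
        # Range Not Satisfiable: discard the partial file and
        # repeat the request without a Range header.
        return "restart"
    # Anything else is treated as a real download error.
    return "fail"
```

With this in place a 416 becomes a recoverable situation (next_step(416) yields "restart") instead of an exception that aborts resolution.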

So... is Maven Central unable to answer range requests? Is some edge case being hit? And what are range requests used for? That last one is easy: resuming partial downloads.

The likely culprit was me (or my network) stopping sbt while it was fetching the libraries, at some point in the past… maybe. That would leave a partial download somewhere. Easy fix: clean up the partial download that is lying there, then update. The problem is that, with coursier, somewhere can be more than one place. Also, I wasn’t sure where the problem was coming from.

What to remove

In my case, I had to remove the coursier local cache at ~/.coursier/cache because this is where the partial download for gelfj was. But it might have been any of ~/.ivy2/cache or ~/.m2/repository. Maybe even the sbt cache at ~/.sbt/. Or the ensime cache.
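
Before deleting a whole cache, it can help to look for the leftover file itself. The following is a rough sketch, assuming partial downloads carry a .part suffix (which is what I saw in the coursier cache; I have not checked whether the other caches name theirs the same way). The function name find_partial_downloads is hypothetical.

```python
from pathlib import Path


def find_partial_downloads(cache_dirs):
    """Return every *.part file found under the given cache directories.

    Directories that do not exist are skipped silently.
    """
    leftovers = []
    for cache_dir in cache_dirs:
        root = Path(cache_dir).expanduser()
        if not root.is_dir():
            continue
        # rglob walks the whole tree below root; keep regular files only.
        leftovers.extend(sorted(p for p in root.rglob("*.part") if p.is_file()))
    return leftovers
```

Calling find_partial_downloads(["~/.coursier/cache"]) and printing the result would list the candidates to remove by hand, which is gentler than wiping the entire cache.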

Reproducing, fixing

I managed to reproduce it relatively easily. Open any location in your .coursier/cache/v1 folder containing a JAR file and rename the JAR from blah.jar to blah.jar.part. Now the "partial" download already has the full size, so the resume request asks for a range that starts at the end of the file, which the server cannot satisfy (this is actually what was happening in my case: Coursier died just before moving .part to .jar). If you run sbt update under coursier on any project using this JAR, resolution will fail with a 416.
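
To see why a full-size .part file triggers the error, here is a simplified model in Python. It assumes the client resumes by asking for everything from the partial file's size onward, and that the server treats a range as satisfiable only when its first byte lies inside the file. Both function names are hypothetical, and a real server may behave differently in corner cases.

```python
def resume_range_header(partial_size: int) -> str:
    """Range header value asking for all bytes from partial_size onward."""
    return f"bytes={partial_size}-"


def status_for_resume(partial_size: int, remote_size: int) -> int:
    """Status code a server would return for that open-ended range request.

    The range is satisfiable only if its first byte position is
    strictly smaller than the length of the remote file.
    """
    if partial_size < remote_size:
        return 206  # Partial Content
    return 416  # Range Not Satisfiable
```

For a 1000-byte JAR, a 400-byte partial file sends bytes=400- and gets a 206, while the renamed full-size file sends bytes=1000- and gets a 416, since there is no byte at position 1000.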

Fixing it was straightforward (the coursier code base seems easy to search): I added a check for this response code in the area that resets the connection if the returned headers are not valid.

To test the functionality I used sbt-plugins/publish-local to create a SNAPSHOT build I could set in my .sbt. Once I had tested the fix manually, I ran the test suites. I wanted to add a test, but this seemed untestable under the current suites, so I pushed and waited. The lead maintainer gave me some pointers on how to create a test using the current systems (I got very close, but it didn’t work in the end). Then a PR by wisechengyi added several helpers for testing a PR he had created in a similar situation, so I could add a test to mine. And done!
Written by Ruben Berenguel