When you're integrating Hadoop and other big-data frameworks with AWS S3, you will quickly run into a situation where you need to add the hadoop-aws and aws-java-sdk-bundle JARs to your classpath.
Unfortunately, these JARs are versioned separately, and it is hard to work out which versions are compatible. The hadoop-aws JAR has to match your Hadoop version exactly, so that one is easy to pick; the SDK bundle version is the one you have to look up.
Determining the Right Version
- Check your Hadoop version.
- Get the hadoop-aws JAR with exactly the same version.
- Go to the Maven repository page for that version of the hadoop-aws JAR and look at its compile dependencies. E.g. at https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.9 you can see the SDK dependency is com.amazonaws » aws-java-sdk-bundle 1.11.199.
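The first two steps and the page lookup can be sketched in a few lines of Python. This is only an illustration: it assumes the first line of the `hadoop version` output has the form `Hadoop <version>`, the helper names are made up for this example, and the sample output string is hypothetical. The SDK bundle version itself still has to be read from the compile dependencies listed on the resulting page.

```python
import re


def parse_hadoop_version(version_output: str) -> str:
    """Extract the version from a 'Hadoop <version>' line (assumed format)."""
    match = re.search(r"^Hadoop\s+(\S+)", version_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Hadoop <version>' line found")
    return match.group(1)


def hadoop_aws_coordinates(hadoop_version: str) -> str:
    """Maven coordinates of the hadoop-aws JAR; its version equals the Hadoop version."""
    return f"org.apache.hadoop:hadoop-aws:{hadoop_version}"


def hadoop_aws_page(hadoop_version: str) -> str:
    """Page whose compile dependencies list the aws-java-sdk-bundle version."""
    return (
        "https://mvnrepository.com/artifact/"
        f"org.apache.hadoop/hadoop-aws/{hadoop_version}"
    )


# Hypothetical sample of what `hadoop version` might print.
sample_output = "Hadoop 2.9.0\nCompiled by someone on some date\n"

version = parse_hadoop_version(sample_output)
print(hadoop_aws_coordinates(version))
print(hadoop_aws_page(version))
```

On a real machine you would pass in the actual command output, for example via `subprocess.run(["hadoop", "version"], capture_output=True, text=True).stdout`.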
Thanks for posting this. It is surprisingly un-obvious (to me anyway) until one understands that the Hadoop version decides the hadoop-aws version, which in turn decides the SDK bundle version.