Troubleshooting SOAP and MTOM using the command line
When you want to transmit binary files over SOAP-based webservices, you have two choices: Base64 or Message Transmission Optimization Mechanism (MTOM). The latter is much more efficient, but also harder to troubleshoot if it doesn’t work right away.
Both options have their own typical scenario:
- Serialise the file content using Base64 and include the result right in the XML structure. This is relatively easy to implement and troubleshoot. It usually works well for small binary files, but as files grow larger, you may run into performance issues. The Base64-encoded binary file may be so big that the XML parser will blow up.
- Use MTOM to transfer the request and attachments. In this approach, the SOAP request (XML) and any attachments are sent as a multipart request. This is a bit harder to implement, and if it doesn’t work right away, it is even harder to troubleshoot. On the other hand, it allows for much more efficient transport of the attached files.
SOAP over HTTP
As a primer, let’s have a quick look at how “regular” SOAP requests look when transmitted over HTTP. Imagine we have a file upload webservice which accepts files in any format, along with a file name. If we were to upload a file, the request might look like this.
POST /ws/ HTTP/1.1
Host: localhost:8080
Content-Type: text/xml; charset=utf-8
Content-Length: 515

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header />
<SOAP-ENV:Body>
<ns1:uploadFileRequest xmlns:ns1="urn:example:Upload">
<ns1:name>red-square.png</ns1:name>
<ns1:content>UE5HDQoaCgAAAA1JSERSAAAACgAAAAoIAgAAAAJQWAAAAAlwSFlzAAALEwAACxMBABgAAAAHdElNRQcMEQcuClQ7ZQAAAB1pVFh0Q29tbWVudAAAAAAAQ3JlYXRlZCB3aXRoIEdJTVBkLmUHAAAAEklEQVQYYzAxSmMBAEEsARNlYhAzAAAAAElFTkRCYA==</ns1:content>
</ns1:uploadFileRequest>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
An interesting detail here is that the file itself is just 156 bytes. The Base64-encoded version is 193 bytes. For small files, this is not a big difference. If we take a 3.2 MB PDF file, its Base64-encoded counterpart grows to 4.3 MB. And a 251 MB gzipped TAR archive grows to 335 MB when it is Base64-encoded.
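To measure this overhead for your own files, a small helper along these lines may be handy. It is a minimal sketch: the function name `compare_sizes` is made up for this example, and it assumes the `base64`, `wc` and `tr` tools found on most Linux and macOS systems.

```shell
#!/bin/bash
# Print the raw size and the Base64-encoded size of a file, in bytes.
# compare_sizes is a hypothetical helper name, not an existing tool.
compare_sizes() {
  local file=$1
  local raw encoded
  # tr -d ' ' removes the padding some wc implementations add
  raw=$(wc -c < "$file" | tr -d ' ')
  # Strip the line wraps that some base64 implementations insert,
  # so only the encoded characters themselves are counted
  encoded=$(base64 < "$file" | tr -d '\r\n' | wc -c | tr -d ' ')
  echo "raw=${raw} base64=${encoded}"
}
```

Calling `compare_sizes red-square.png` prints both numbers next to each other, which makes it easy to see how much a given file grows once it is embedded in the XML.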
MTOM requests
Now, let’s zoom in on the MTOM approach. Here, the binary contents aren’t Base64-encoded and embedded into the XML structure. Instead, we use a multipart MIME format where the first part is the XML structure and the following parts are the binary contents of attached files. It looks like so:
POST /ws/ HTTP/1.1
Host: localhost:8080
Content-Type: Multipart/Related; start="0968015446"; boundary="--=_Part_4_1959909680.1544697065790"
Content-Length: 728

----=_Part_4_1959909680.1544697065790
Content-Type: text/xml; charset=UTF-8
Content-ID: 0968015446

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header />
<SOAP-ENV:Body>
<ns1:uploadFileRequest xmlns:ns1="urn:example:Upload">
<ns1:name>red-square.png</ns1:name>
<ns1:Data>cid:962538495782</ns1:Data>
</ns1:uploadFileRequest>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
----=_Part_4_1959909680.1544697065790
Content-Type: application/octet-stream
Content-ID: 962538495782
Content-Transfer-Encoding: binary

?PNG
IHDR
PX? pHYs
??tIME?
.
T?;eiTXtCommentCreated with GIMPd.eIDAT?c?π01?JcA,eb3IEND?B`?%
----=_Part_4_1959909680.1544697065790--
In the request, the file contents have disappeared.
Instead, we see a reference to one of the attached files.
Also, we see that the XML request and the attachments live in separate parts of the request.
The Content-Type header has a few parameters:
- the start parameter identifies the SOAP request itself
- the boundary parameter helps separate the different parts of the request.
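When debugging, it can help to pull these parameters out of a captured header and see which delimiter lines they imply. The sketch below does that; `header_param` is a hypothetical helper written for this example, and it only handles quoted parameter values like the ones in the request above.

```shell
#!/bin/bash
# Extract a quoted parameter value from a Content-Type header value.
# header_param is a hypothetical helper; it assumes the value is quoted.
header_param() {
  local header=$1 name=$2
  printf '%s\n' "$header" | sed -n "s/.*${name}=\"\([^\"]*\)\".*/\1/p"
}

header='Multipart/Related; start="0968015446"; boundary="--=_Part_4_1959909680.1544697065790"'
boundary=$(header_param "$header" boundary)

# Every part is introduced by "--" followed by the boundary value,
# and the last delimiter gets an extra "--" appended.
echo "delimiter: --${boundary}"
echo "closing:   --${boundary}--"
```

This also explains the four leading dashes in the request: the boundary value itself starts with two dashes, and the multipart format prefixes two more on every delimiter line.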
As we can see, the MTOM request is a bit larger (a Content-Length of 728 bytes versus 515). This is because of the overhead of the extra headers and delimiters. For larger files (somewhere above 1000 bytes), this typically reverses and the MTOM approach becomes more efficient.
Testing
Tooling like SoapUI (a popular choice for interacting with SOAP-based webservices) can help you interact with MTOM-based interfaces. Such tools will set up all of this properly for you. But what if you want to troubleshoot your implementation, and the tooling doesn’t let you alter these things?
Using a simple shell script, you can interact with any MTOM-based web service. It will prepare the right headers for you, concatenate all the files (request and attachments) and finally upload that over HTTP.
#!/bin/bash
FILE_NAME=$1
# Prepare the headers for the XML request body
read -r -d '' REQUEST_HEADERS << EOM
----=_Part_4_1959909680.1544697065790
Content-Type: text/xml; charset=UTF-8
Content-ID: 0968015446
EOM
# Prepare the XML request body itself
read -r -d '' XML_REQUEST << EOM
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header />
<SOAP-ENV:Body>
<ns1:uploadFileRequest xmlns:ns1="urn:example:Upload">
<ns1:name>$FILE_NAME</ns1:name>
<ns1:Data>cid:962538495782</ns1:Data>
</ns1:uploadFileRequest>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
EOM
# Prepare the headers for the attachment
read -r -d '' FILE_HEADERS << EOM
----=_Part_4_1959909680.1544697065790
Content-Type: application/octet-stream; name=${FILE_NAME}
Content-Transfer-Encoding: binary
Content-ID: <962538495782>
Content-Disposition: attachment; name="${FILE_NAME}"; filename="${FILE_NAME}"
EOM
# Prepare the terminating line
read -r -d '' TERMINATOR << EOM
----=_Part_4_1959909680.1544697065790--
EOM
# Create a temporary file
REQUEST_BODY=$(mktemp)
echo Using temporary file: ${REQUEST_BODY}
echo
# Stitch the request body together by concatenating all parts in the right order.
# printf expands \r\n in its format string and, unlike echo, adds no extra newline,
# so no stray bytes end up in front of the binary attachment.
# ANSI-C quoting ($'\r\n') is an alternative: https://stackoverflow.com/a/5295906/1523342
printf '%s\r\n\r\n' "$REQUEST_HEADERS" >> "${REQUEST_BODY}"
printf '%s\r\n' "$XML_REQUEST" >> "${REQUEST_BODY}"
printf '%s\r\n\r\n' "$FILE_HEADERS" >> "${REQUEST_BODY}"
cat "${FILE_NAME}" >> "${REQUEST_BODY}"
# The terminating line has to start on a line of its own
printf '\r\n%s\r\n' "$TERMINATOR" >> "${REQUEST_BODY}"
# Finally, upload the request body
# Based on & inspired by: https://stackoverflow.com/a/45289969/1523342
curl -v http://localhost:8080/ws \
-H 'Content-Type: multipart/related; type="text/xml"; start="0968015446"; boundary="--=_Part_4_1959909680.1544697065790"' \
-H 'MIME-Version: 1.0' \
-H 'SOAPAction: ""' \
--data-binary @${REQUEST_BODY}
# Remove the temporary file.
rm ${REQUEST_BODY}
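If the server rejects the request, a quick sanity check on the stitched body can rule out a malformed multipart structure. The helper below is a sketch under assumptions: `count_delimiters` is a name invented for this example, and it would have to be called before the temporary file is removed.

```shell
#!/bin/bash
# Count the lines of a multipart body that contain a delimiter.
# count_delimiters is a hypothetical helper, not part of the script above.
count_delimiters() {
  local body_file=$1 boundary=$2
  # -a treats the binary body as text, -F matches the boundary literally,
  # -c counts matching lines; the bare "--" ends grep's option parsing.
  grep -a -c -F -- "--${boundary}" "$body_file"
}
```

For a request with one XML part and one attachment, `count_delimiters "${REQUEST_BODY}" '--=_Part_4_1959909680.1544697065790'` should report 3: one delimiter per part, plus the terminating line.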
Using this script, you can finally tweak every part of the request.
A client of your webservice skips the start parameter of the Content-Type?
No problem, you can mimic that and debug why it’s not working.
Another client doesn’t have the terminating multipart separator?
Same thing - just alter the script and attach a debugger to your program.
After that, invoke it with ./soap-mtom.sh red-square.png.
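One way to make such experiments repeatable is to build the Content-Type header in a function, so that leaving out a parameter is a matter of passing an empty argument. This is only a sketch: `content_type_header` is a hypothetical helper, and the header layout simply mirrors the one used in the curl call of the script.

```shell
#!/bin/bash
# Build the Content-Type header for the multipart request.
# Pass an empty second argument to leave out the start parameter.
# content_type_header is a hypothetical helper name.
content_type_header() {
  local boundary=$1 start=$2
  local header='Content-Type: multipart/related; type="text/xml"'
  if [ -n "$start" ]; then
    header="${header}; start=\"${start}\""
  fi
  printf '%s; boundary="%s"\n' "$header" "$boundary"
}
```

In the script, the hard-coded header could then be replaced by something like `-H "$(content_type_header '--=_Part_4_1959909680.1544697065790' '')"` to mimic a client that omits the start parameter.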
For reference, you can find the red square here.
Happy hacking!