How to exploit parser differentials

Source: GitLab Blog | Author: Joern Schneeweisz

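Moving to a microservices-based architecture creates more attack surface for malicious actors. Security researchers discovered a file upload vulnerability within GitLab, which was promptly fixed in the GitLab 12.7.4 security release. We dig deeper into the issue behind this vulnerability and use it to illustrate the underlying concept of parser differentials.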

File Uploads in GitLab

To understand the file upload vulnerability we need to go a bit deeper into file uploads within GitLab, and have a look at the involved components.

GitLab Workhorse

The first relevant component is GitLab’s very own reverse proxy, gitlab-workhorse. gitlab-workhorse fulfills a variety of tasks, but for this specific example we only care about certain kinds of file uploads.

The second component is gitlab-rails, the Ruby on Rails-based heart of GitLab. It’s the main application part of GitLab and implements most of the business logic.

The following source code excerpts from gitlab-workhorse are based on the 8.18.0 release, which was the most recent version at the time the vulnerability was identified.

Consider the following route, defined in internal/upstream/routes.go, which handles file uploads for Conan packages:

// Conan Artifact Repository
route("PUT", apiPattern+`v4/packages/conan/`, filestore.BodyUploader(api, proxy, nil)),

The route defined above passes any PUT request to paths underneath /api/v4/packages/conan/ to the BodyUploader. Within this BodyUploader some magic happens. Well, actually, it’s not magic: the BodyUploader receives the uploaded file and lets the gitlab-rails backend know where the file has been placed. This happens in internal/filestore/file_handler.go.

Also worth mentioning: any route not matched in gitlab-workhorse is passed on to the backend without modification. That’s especially important in our discussion for non-PUT routes under /api/v4/packages/conan/.

// GitLabFinalizeFields returns a map with all the fields GitLab Rails needs in order to finalize the upload.
func (fh *FileHandler) GitLabFinalizeFields(prefix string) map[string]string {
	data := make(map[string]string)
	// key prepends the prefix (if any) to a field name, e.g. "file.path".
	key := func(field string) string {
		if prefix == "" {
			return field
		}

		return fmt.Sprintf("%s.%s", prefix, field)
	}

	if fh.Name != "" {
		data[key("name")] = fh.Name
	}
	if fh.LocalPath != "" {
		data[key("path")] = fh.LocalPath
	}
	if fh.RemoteURL != "" {
		data[key("remote_url")] = fh.RemoteURL
	}
	if fh.RemoteID != "" {
		data[key("remote_id")] = fh.RemoteID
	}
	data[key("size")] = strconv.FormatInt(fh.Size, 10)
	for hashName, hash := range fh.hashes {
		data[key(hashName)] = hash
	}

	return data
}

So gitlab-workhorse will replace the uploaded file name by the path to where it has stored the file on disk, such that the gitlab-rails backend knows where to pick it up.
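To make the rewritten parameters concrete, here is a minimal Ruby sketch that mirrors the Go function above. The FileHandlerSketch struct, the finalize_fields helper, and the sample values are hypothetical stand-ins for illustration; only the "prefix.field" key scheme is taken from the excerpt.

```ruby
# Hypothetical stand-in for the Go FileHandler; member names follow the excerpt.
FileHandlerSketch = Struct.new(
  :name, :local_path, :remote_url, :remote_id, :size, :hashes,
  keyword_init: true
)

# Builds the "prefix.field" => value map that the backend receives.
def finalize_fields(fh, prefix)
  key = ->(field) { prefix.empty? ? field : "#{prefix}.#{field}" }
  data = {}
  data[key.call("name")] = fh.name unless fh.name.to_s.empty?
  data[key.call("path")] = fh.local_path unless fh.local_path.to_s.empty?
  data[key.call("remote_url")] = fh.remote_url unless fh.remote_url.to_s.empty?
  data[key.call("remote_id")] = fh.remote_id unless fh.remote_id.to_s.empty?
  data[key.call("size")] = fh.size.to_s
  (fh.hashes || {}).each { |hash_name, hash| data[key.call(hash_name)] = hash }
  data
end

# Sample values chosen to resemble the request shown in this post.
SAMPLE_HANDLER = FileHandlerSketch.new(
  local_path: "/var/opt/gitlab/gitlab-rails/shared/packages/tmp/uploads/582573467",
  size: 1765,
  hashes: { "sha1" => "93ebaf6e85e8edde99c1ed46eaa1b5e1e5f4ac78" }
)

finalize_fields(SAMPLE_HANDLER, "file").each { |k, v| puts "#{k}=#{v}" }
```

With the prefix "file", the sketch emits file.path, file.size and file.sha1, and omits file.name because no name was set.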

Observe the following original request, as received by gitlab-workhorse:

PUT /api/v4/packages/conan/v1/files/Hello/0.1/root+xxxxx/beta/0/export/ HTTP/1.1
Host: localhost
User-Agent: Conan/1.22.0 (Python 3.8.1) python-requests/2.22.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: close
X-Checksum-Sha1: 93ebaf6e85e8edde99c1ed46eaa1b5e1e5f4ac78
Content-Length: 1765
Authorization: Bearer [.. shortened ..]

from conans import ConanFile, CMake, tools

class HelloConan(ConanFile):
    name = "Hello"
[.. shortened ..]

This is what this request will look like to gitlab-rails after gitlab-workhorse has processed it (excerpted from api_json.log):

{
  "time": "2020-02-20T14:49:44.738Z",
  "severity": "INFO",
  "duration": 201.93,
  "db": 67.34,
  "view": 134.59,
  "status": 200,
  "method": "PUT",
  "path": "/api/v4/packages/conan/v1/files/Hello/0.1/root+xxxxx/beta/0/export/",
  "params": [
    {
      "key": "file.md5",
      "value": "719f0319f1fd5f6fcbc2433cc0008817"
    },
    {
      "key": "file.path",
      "value": "/var/opt/gitlab/gitlab-rails/shared/packages/tmp/uploads/582573467"
    },
    {
      "key": "file.sha1",
      "value": "93ebaf6e85e8edde99c1ed46eaa1b5e1e5f4ac78"
    },
    {
      "key": "file.sha256",
      "value": "f7059b223cd4d32002e5e34ab1ae5b4ea12f3bd0326589b00d5e910ce02c1f3a"
    },
    {
      "key": "file.sha512",
      "value": "efbe75ea58bd817d42fd9ca5ac556abd6fbe3236f66dfad81d508b5860252d32d1b1868ee03c7f4c6174a0ba6cc920a574b5865ca509f36c451113c9108f9a36"
    },
    {
      "key": "file.size",
      "value": "1765"
    }
  ],
  "host": "localhost",
  "remote_ip": ",",
  "ua": "Conan/1.22.0 (Python 3.8.1) python-requests/2.22.0",
  "route": "/api/:version/packages/conan/v1/files/:package_name/:package_version/:package_username/:package_channel/:recipe_revision/export/:file_name",
  "user_id": 1,
  "username": "root",
  "queue_duration": 16.59,
  "correlation_id": "aSEqrgEfvX9"
}

In particular, the params entry file.path is of interest, as it denotes the file system path where gitlab-workhorse has placed the uploaded file.


This gitlab-workhorse-modified request, as gitlab-rails will see it, is handled in lib/uploaded_file.rb within the from_params method:

01  def self.from_params(params, field, upload_paths)
02    path = params["#{field}.path"]
03    remote_id = params["#{field}.remote_id"]
04    return if path.blank? && remote_id.blank?
05
06    file_path = nil
07    if path
08      file_path = File.realpath(path)
09
10      paths = Array(upload_paths) << Dir.tmpdir
11      unless self.allowed_path?(file_path, paths.compact)
12        raise InvalidPathError, "insecure path used '#{file_path}'"
13      end
14    end
15
16    UploadedFile.new(file_path,
17      filename: params["#{field}.name"],
18      content_type: params["#{field}.type"] || 'application/octet-stream',
19      sha256: params["#{field}.sha256"],
20      remote_id: remote_id,
21      size: params["#{field}.size"])
22  end

Here we can see how the uploaded file reference is handled. Lines 10-13 in the snippet above implement a whitelist of paths from which a gitlab-workhorse uploaded file will be accepted. Dir.tmpdir, which resolves to the path /tmp, is added to the whitelist as well. In the subsequent lines a new UploadedFile is constructed from the file.path and the other parameters gitlab-workhorse has set.
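The excerpt does not show allowed_path? itself. The following is a minimal sketch of how such a check on a resolved path could work, not GitLab's actual implementation: the names allowed_path_sketch? and resolve_upload_path are hypothetical, and ArgumentError stands in for InvalidPathError.

```ruby
require "tmpdir"

# Hypothetical allowlist check: the already-resolved file path must live
# underneath one of the allowed directories.
def allowed_path_sketch?(file_path, allowed_dirs)
  allowed_dirs.any? do |dir|
    next false unless File.directory?(dir)

    # Resolve the directory too, so both sides are compared in canonical form.
    root = File.realpath(dir)
    file_path.start_with?(root + File::SEPARATOR)
  end
end

# Mirrors lines 08-13 above: resolve symlinks and ".." first, then check the
# result against the upload paths plus Dir.tmpdir.
def resolve_upload_path(path, upload_paths)
  file_path = File.realpath(path)
  dirs = Array(upload_paths) + [Dir.tmpdir]
  unless allowed_path_sketch?(file_path, dirs.compact)
    raise ArgumentError, "insecure path used '#{file_path}'"
  end

  file_path
end

# A file below the temp directory is accepted even with no upload paths given.
Dir.mktmpdir do |dir|
  upload = File.join(dir, "upload")
  File.write(upload, "data")
  puts resolve_upload_path(upload, [])
end
```

The point to notice is that the check only constrains where the file lives. It says nothing about who supplied the path, which is what the bypass described next relies on.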

gitlab-workhorse bypass

So we’ve seen the inner workings of both gitlab-workhorse and gitlab-rails when it comes to file uploads for Conan packages. In recap, the flow goes as follows:

1. User to workhorse: PUT request to the Conan registry.
2. workhorse: place the uploaded file on disk and rewrite the PUT request.
3. workhorse to Rails: pass on the modified PUT request.
4. Rails: pick up the file from disk and store it in an UploadedFile.

From an attacker’s perspective it would be nice to meddle with the modified PUT request. In particular, control over the file.path parameter would allow us to grab arbitrary files from /tmp and the defined upload_paths. But as gitlab-workhorse sits right in front of gitlab-rails, we can’t just pass those parameters or otherwise interact directly with gitlab-rails without going through gitlab-workhorse.

We can indeed achieve this by leveraging the fact that gitlab-workhorse parses HTTP requests differently than gitlab-rails does. In particular, we can use Rack::MethodOverride, which is a default middleware in Ruby on Rails applications. The Rack::MethodOverride middleware allows us to send a POST request and let gitlab-rails know “well, actually this is a PUT request! ¯\_(ツ)_/¯”. With this little trick we can sneak past the gitlab-workhorse route which would intercept the PUT request, as gitlab-workhorse is not aware of the overridden POST method. So by specifying either a _method=PUT parameter or an X-HTTP-METHOD-OVERRIDE: PUT HTTP header we can indeed point gitlab-rails directly to files on disk. The method override is used a lot in Ruby on Rails applications to allow simple <form>-based POST requests to use other REST methods like PUT and DELETE by overriding the <form> POST request with the _method parameter.

So a POST request to the right Conan endpoint with a file.path and file.size parameter will do the trick. A full request using this bypass would look like this:

POST /api/v4/packages/conan/v1/files/Hello/0.1/lol+wat/beta/0/export/conanmanifest.txt?file.size=4&file.path=/tmp/test1234 HTTP/1.1
Host: localhost
User-Agent: Conan/1.21.0 (Python 3.8.1) python-requests/2.22.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: close
X-HTTP-Method-Override: PUT
X-Checksum-Deploy: true
X-Checksum-Sha1: ee96149f7b93af931d4548e9562484bdb6ac8fda
Content-Length: 4
Authorization: Bearer [.. shortened ..]


This would, instead of uploading a file, let us get hold of the file /tmp/test1234 from the GitLab server’s file system. In recap, the flow to exploit this issue looks as follows:

1. User to workhorse: POST request to the Conan registry.
2. workhorse: the route does not match anything.
3. workhorse to Rails: pass on the unmodified POST request.
4. Rails: interpret the request as PUT and pick up the file from disk.
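The exploit flow can be reproduced with a toy model of both components. This is a simplified sketch under stated assumptions, not the real gitlab-workhorse or Rack code: the Request struct, the function names and the /uploads/tmp/123 path are hypothetical, the toy proxy is assumed to overwrite client-supplied file.* parameters on intercepted uploads, and only the header form of the method override is modeled.

```ruby
# Toy request: HTTP verb, path, headers and parameters.
Request = Struct.new(:verb, :path, :headers, :params, keyword_init: true)

CONAN_PREFIX = "/api/v4/packages/conan/"

# Proxy view (models gitlab-workhorse): routes on the literal HTTP verb only.
def proxy_intercepts?(req)
  req.verb == "PUT" && req.path.start_with?(CONAN_PREFIX)
end

# Intercepted uploads get their file.* parameters set by the proxy;
# everything else is forwarded unmodified.
def proxy_forward(req)
  return req unless proxy_intercepts?(req)

  trusted = req.params.reject { |k, _| k.start_with?("file.") }
  trusted["file.path"] = "/uploads/tmp/123" # hypothetical location chosen by the proxy
  Request.new(verb: req.verb, path: req.path, headers: req.headers, params: trusted)
end

# Backend view (models the effect of a method override): a POST may be
# reinterpreted as another verb named in a header.
def backend_effective_verb(req)
  override = req.headers["X-HTTP-Method-Override"]
  return override.upcase if req.verb == "POST" && override

  req.verb
end

# Backend handler: trusts file.path on anything it considers a PUT.
def backend_handle(req)
  return "ignored" unless backend_effective_verb(req) == "PUT"

  "reads #{req.params['file.path']}"
end

TARGET = CONAN_PREFIX + "v1/files/Hello/0.1/lol+wat/beta/0/export/conanmanifest.txt"
HONEST = Request.new(verb: "PUT", path: TARGET, headers: {},
                     params: { "file.path" => "/tmp/test1234" })
SNEAKY = Request.new(verb: "POST", path: TARGET,
                     headers: { "X-HTTP-Method-Override" => "PUT" },
                     params: { "file.path" => "/tmp/test1234" })

puts backend_handle(proxy_forward(HONEST)) # prints "reads /uploads/tmp/123"
puts backend_handle(proxy_forward(SNEAKY)) # prints "reads /tmp/test1234"
```

Both requests end up as a PUT from the backend's point of view, but only the first one passed through the proxy's upload handling. The second carries an attacker-chosen file.path straight to the handler.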

We fixed this issue within gitlab-workhorse by signing requests which pass through gitlab-workhorse; the signature is then verified on the gitlab-rails side.
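The post does not describe the signing scheme in detail. As an illustration of the general idea only, here is a sketch that assumes a secret shared between proxy and backend and an HMAC-SHA256 over the upload parameters. The secret, the function names and the parameter encoding are all hypothetical and not taken from GitLab.

```ruby
require "openssl"
require "json"

# Hypothetical secret known only to the proxy and the backend.
SHARED_SECRET = "example-secret-for-illustration-only"

# Proxy side: sign the upload parameters it has written.
# Sorting and JSON-encoding the pairs gives an unambiguous payload.
def sign_params(params, secret)
  payload = JSON.generate(params.sort)
  OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
end

# Compares two strings without stopping at the first differing byte.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Backend side: accept file.* parameters only when the signature matches.
def verified?(params, signature, secret)
  return false if signature.nil?

  constant_time_equal?(sign_params(params, secret), signature)
end

SIGNED_PARAMS = { "file.path" => "/uploads/tmp/123", "file.size" => "4" }
SIGNATURE = sign_params(SIGNED_PARAMS, SHARED_SECRET)
FORGED_PARAMS = SIGNED_PARAMS.merge("file.path" => "/tmp/test1234")

puts verified?(SIGNED_PARAMS, SIGNATURE, SHARED_SECRET) # prints "true"
puts verified?(FORGED_PARAMS, SIGNATURE, SHARED_SECRET) # prints "false"
puts verified?(FORGED_PARAMS, nil, SHARED_SECRET)       # prints "false"
```

With a check like this, a request that bypasses the proxy cannot supply a file.path the backend will accept, regardless of how the two components interpret the HTTP method.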

How parser differentials can introduce vulnerabilities

Let’s take a huge step back and look at what just happened from a high-level perspective. We’ve had gitlab-workhorse and gitlab-rails both looking at a POST request. But gitlab-rails ultimately saw a PUT request due to the overridden HTTP method.

What occurred here is a case of a parser differential, as gitlab-workhorse and gitlab-rails parsed the incoming HTTP request differently. The term parser differential originates from the Language-theoretic Security (LangSec) approach. It denotes the fact that two (or more) different parsers “understand” the very same message in different ways. Or, as the LangSec handout describes it:

Different interpretation of messages or data streams by components breaks any assumptions that components adhere to a shared specification and so introduces inconsistent state and unanticipated computation.

Indeed such issues and the consequential unanticipated computation get more and more common when we look at modern web environments. The days of web applications being a stand-alone bunch of scripts invoked on a web server are long gone. The rise of microservices leads to complex environments and the very same message (or HTTP request) might be interpreted by several different services in several different ways. Just as shown in the above example this sometimes comes along with security implications.

From the point of view of a pragmatic bug hunter, the idea of parser differentials is very interesting, as those issues can yield unique security bugs. Consider, for instance, this RCE in couchdb. The HTTP desync attack technique, which has gotten a lot of attention in the bug bounty community, is also a matter of parser differentials.

From the developer perspective, we need to be aware of other components and their parsing behavior in order to avoid security issues which arise from interpreting the same message differently.

Cover Photo by Marta Branco on Pexels
