API : /sequence/:id
Important Points
- Servers may or may not support circular sequences.
- Servers may or may not support other encodings (JSON, FASTA, etc.) but must support a response type of text/vnd.ga4gh.refget.v1.0.0+plain.
- Clients can query for a sub-sequence and the server MUST honour the request.
- An Accept header in the request is optional; if not given, the default is text/vnd.ga4gh.refget.v1.0.0+plain, but the response MUST have a Content-Type header.
- Servers may support redirection for sequence retrieval using the 302 status code.
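The Accept rule in the points above can be sketched as a small server-side helper. This is a minimal illustration under stated assumptions, not part of any refget implementation: the name `choose_content_type` is hypothetical, and returning `None` for an unsupported media type is an assumption, since error responses are outside this section.

```python
DEFAULT_MEDIA_TYPE = "text/vnd.ga4gh.refget.v1.0.0+plain"


def choose_content_type(accept=None):
    """Pick the media type for the response Content-Type header.

    Hypothetical helper: a missing Accept header falls back to the default
    plain-text type; any other type is treated as unsupported here.
    """
    if accept is None:
        return DEFAULT_MEDIA_TYPE
    # Accept may list several types with parameters, e.g. "a/b;q=0.5, c/d".
    offered = [part.split(";")[0].strip() for part in accept.split(",")]
    if DEFAULT_MEDIA_TYPE in offered or "*/*" in offered:
        return DEFAULT_MEDIA_TYPE
    return None  # assumption: the caller turns this into an error response
```

A server that also supports JSON or FASTA would extend the lookup; only the mandatory plain-text type is handled in this sketch.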
These are possible success responses associated with this API.
Complete Sequence Queries
Case 1
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : MD5
Request Headers : Accept
Description : Complete sequence will be retrieved regardless of the type (circular or non-circular), with the encoding explicitly defined.
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
CCACA........GTGGG
Case 2
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : MD5
Request Headers : None
Description : Complete sequence will be retrieved regardless of the type (circular or non-circular), using the default encoding.
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
CCACA........GTGGG
Case 3
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : Truncated SHA512
Request Headers : Accept
Description : Complete sequence will be retrieved regardless of the type (circular or non-circular). The checksum algorithm must be supported by the server; otherwise the server will respond with a 404 Not Found error.
GET /sequence/959cb1883fc1ca9ae1394ceb475a356ead1ecceff5824ae7/
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
CCACA........GTGGG
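The two identifier styles used in these requests can be computed as sketched below. The sketch assumes the refget v1 definition of the truncated SHA-512 identifier (the first 24 bytes of the SHA-512 digest, hex-encoded), which is consistent with the 48-character identifier in the request above. The function names are hypothetical, and any normalisation of the sequence before hashing is left out.

```python
import hashlib


def md5_id(sequence: str) -> str:
    """MD5 digest of the sequence, hex-encoded (32 characters)."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


def trunc512_id(sequence: str) -> str:
    """First 24 bytes of the SHA-512 digest, hex-encoded (48 characters).

    Assumption: this follows the refget v1 truncated SHA-512 scheme.
    """
    digest = hashlib.sha512(sequence.encode("ascii")).digest()
    return digest[:24].hex()
```

Either identifier can then be placed in the path, e.g. "/sequence/" + trunc512_id(seq) + "/", provided the server supports that algorithm.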
Case 4
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : Truncated SHA512
Request Headers : Accept
Description : Redirects the request to retrieve the sequence from an alternative location (e.g. an AWS S3 bucket). The server will respond with 302 Found and the client must follow the redirect.
GET /sequence/959cb1883fc1ca9ae1394ceb475a356ead1ecceff5824ae7/
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 302 Found
Location: s3.aws.com/bucketname/959cb1883fc1ca9ae1394ceb475a356ead1ecceff5824ae7
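Many HTTP client libraries follow redirects on their own, but the client-side decision can be sketched explicitly. This is a minimal illustration: `redirect_target` is a hypothetical helper, and passing the response headers as a plain dict is an assumption about the caller.

```python
def redirect_target(status, headers):
    """Return the URL to request next, or None if the response is final.

    `headers` is a plain dict of response headers (hypothetical shape).
    """
    if status != 302:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("location")
    if location is None:
        raise ValueError("302 response without a Location header")
    return location
```

The caller would repeat the GET against the returned location and treat that response as the sequence retrieval.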
Sub-Sequence Queries
Using start / end query string parameters
Important Points:
- start is 0-based inclusive while end is 0-based exclusive
- start and end are both 32-bit unsigned integers.
- The start and end parameters must not be used along with the Range header.
- When using start and end, responses must have an Accept-Ranges header set to none.
- Case 4 of this section applies only to servers which support circular sequences.
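The parameter rules above can be checked before any sequence is read. The sketch below is illustrative only: `validate_start_end` is a hypothetical name, and raising `ValueError` stands in for whatever error response a server would actually send.

```python
UINT32_MAX = 2**32 - 1


def validate_start_end(start=None, end=None, range_header=None):
    """Raise ValueError if the start/end parameters break the rules above."""
    if range_header is not None and (start is not None or end is not None):
        raise ValueError("start/end must not be combined with a Range header")
    for name, value in (("start", start), ("end", end)):
        if value is None:
            continue  # an omitted parameter is allowed
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer")
        if not 0 <= value <= UINT32_MAX:
            raise ValueError(f"{name} must fit in a 32-bit unsigned integer")
```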
Case 1
Circular or Non-circular Sequences
Query parameters : start and end
Checksum Algorithm : MD5
Request Headers : Accept
Conditions : start < end; start < size of sequence
Description : Sub-sequence will be retrieved regardless of the type (circular or non-circular).
Size of the sequence is 230218
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=10&end=20
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 10
Accept-Ranges: none
CCCACACACC
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=10&end=11
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 1
Accept-Ranges: none
C
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=0&end=1
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 1
Accept-Ranges: none
C
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=230217&end=230218
Accept: text/vnd.ga4gh.refget.v1.0.0+plain
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 1
Accept-Ranges: none
G
Case 2
Circular or Non-circular Sequences
Query parameters : start and end
Checksum Algorithm : MD5
Request Headers : NA
Conditions : start = end; start < size of sequence
Description : Sub-sequence of length 0 will be returned (i.e. an empty string), as start is inclusive but end is exclusive.
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=10&end=10
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 0
Accept-Ranges: none
Case 3
Non-circular Sequences
Query parameters : Either start or end
Checksum Algorithm : MD5
Request Headers: NA
Conditions : Either start or end given; start < size of the sequence; end <= size of the sequence
Description : Sub-sequence will be retrieved. If only start is given, end is assumed to equal the size of the sequence. If only end is given, start is assumed to equal 0.
Size of the sequence is 230218
For example :
Sequence : ATGCATGCATGCATGC ; start = 1
Response : TGCATGCATGCATGC
Sequence : ATGCATGCATGCATGC ; end = 8
Response : ATGCATGC
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=10
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230208
Accept-Ranges: none
CCCAC....GTGGG
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?end=5
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 5
Accept-Ranges: none
CCACA
When start = 0
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?start=0
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
Accept-Ranges: none
CCACA......TGTGGG
When end = size of sequence
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?end=230218
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
Accept-Ranges: none
CCACA......TGTGGG
GET /sequence/6681ac2f62509cfc220d78751b8dc524/?end=0
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 0
Accept-Ranges: none
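The defaulting of a missing start or end in this case maps directly onto a half-open slice. The sketch below is a minimal illustration for non-circular sequences; `subsequence` is a hypothetical name, and raising `ValueError` is a placeholder for the error responses, which are outside this section.

```python
def subsequence(sequence, start=None, end=None):
    """Slice a non-circular sequence with optional start/end.

    start is 0-based inclusive, end is 0-based exclusive.
    """
    size = len(sequence)
    if start is None:
        start = 0       # only end given: begin at the first base
    if end is None:
        end = size      # only start given: run to the last base
    # Conditions from this case: start < size and end <= size.
    if start >= size or end > size or start > end:
        raise ValueError("range not valid for a non-circular sequence")
    return sequence[start:end]


print(subsequence("ATGCATGCATGCATGC", start=1))  # TGCATGCATGCATGC
print(subsequence("ATGCATGCATGCATGC", end=8))    # ATGCATGC
```

The Content-Length of the response is then simply end minus start, which gives 230208 for start=10 on a sequence of size 230218.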
Case 4
Note : Only for servers which support circular sequences
Circular Sequences
Query parameters : start and end
Checksum Algorithm : MD5
Request Headers : None
Conditions : start > end; start < size of sequence; end <= size of sequence. Circular sequences must be supported by the server. (This support is optional. The server will return a Not Implemented error if circular sequences are not supported, which is covered in the Error section.)
Description : Sub-sequence will be retrieved from start to the last byte of the sequence, then immediately from the first byte up to end.
Size of the sequence is 5386
For example :
Sequence : ATGCATGCATGCATGC ; start = 10 & end = 2
Response : GCATGC + AT -> GCATGCAT
GET /sequence/3332ed720ac7eaa9b3655c06f6b9e196/?start=5374&end=5
HTTP/1.1 200 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 17
Accept-Ranges: none
ATCCAACCTGCAGAGTT
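The wrap-around retrieval for circular sequences can be sketched as two slices joined at the origin. The function names are hypothetical and the sketch covers only the start > end case described here.

```python
def circular_subsequence(sequence, start, end):
    """Return the bases from start to the end of the sequence, then from
    the origin up to (but not including) end. Expects end < start < size."""
    size = len(sequence)
    if not (end < start < size):
        raise ValueError("expected end < start < size")
    return sequence[start:] + sequence[:end]


def circular_length(size, start, end):
    """Number of bases returned for a query that wraps the origin."""
    return (size - start) + end


print(circular_subsequence("ATGCATGCATGCATGC", 10, 2))  # GCATGCAT
print(circular_length(5386, 5374, 5))                   # 17
```

The second call reproduces the Content-Length of 17 in the request above: 12 bases up to the end of the sequence plus 5 from the origin.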
Using Range Header
Notation:
Range: bytes=first-byte-spec - last-byte-spec
For example : Range: bytes=5-10. Here 5 is first-byte-spec and 10 is last-byte-spec.
Important Points:
- The Range header's unit is bytes. first-byte-spec and last-byte-spec must be integers, and last-byte-spec >= first-byte-spec MUST hold.
- first-byte-spec and last-byte-spec are both 0-based inclusive, as opposed to start and end, where end is exclusive.
- Sub-sequences of a circular sequence across the origin must not be requested via the Range header (see the first point).
- More information can be found in RFC 7233, Section 3.
- If last-byte-spec is equal to or greater than the size of the sequence, the server MUST replace the value of last-byte-spec with (size - 1).
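The Range rules above can be sketched as a parser that clamps last-byte-spec and converts the inclusive pair into the half-open start/end form. This is a simplified illustration: `parse_range` is a hypothetical name, only the single "bytes=first-last" form used in this document is accepted, and rejecting a first-byte-spec beyond the sequence with `ValueError` is an assumption, since error responses are covered elsewhere.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d+)$")


def parse_range(header, size):
    """Turn 'bytes=first-last' into a half-open (start, end) pair."""
    match = _RANGE_RE.match(header.strip())
    if match is None:
        raise ValueError("unsupported Range header: " + header)
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("last-byte-spec must be >= first-byte-spec")
    if first >= size:
        raise ValueError("first-byte-spec is beyond the end of the sequence")
    last = min(last, size - 1)   # clamp an oversized last-byte-spec
    return first, last + 1       # inclusive last -> exclusive end


print(parse_range("bytes=10-19", 230218))        # (10, 20)
print(parse_range("bytes=10-99999999", 230218))  # (10, 230218)
```

The Content-Length of the partial response is end minus start of the returned pair, so bytes=10-19 yields 10 bytes, matching start=10&end=20.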
Case 1
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : MD5
Request Header: Range
Conditions : first-byte-spec <= last-byte-spec < 'size - 1' (if first-byte-spec is 0, last-byte-spec can not be 'size - 1')
Description : Sub-sequence will be retrieved regardless of the type (circular or non-circular). The response code should be 206 when using the Range header.
Size of the sequence is 230218
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=10-19
HTTP/1.1 206 Partial Content
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 10
CCCACACACC
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=10-230217
HTTP/1.1 206 Partial Content
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230208
CCCAC.....GTGGG
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=10-99999999
HTTP/1.1 206 Partial Content
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230208
CCCAC.....GTGGG
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=0-0
HTTP/1.1 206 Partial Content
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 1
C
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=230217-230217
HTTP/1.1 206 Partial Content
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 1
G
Case 2
Circular or Non-circular Sequences
Query parameters : NA
Checksum Algorithm : MD5
Request Header : Accept
Conditions : first-byte-spec = 0 and last-byte-spec >= size - 1
Description : Complete sequence will be retrieved regardless of the type (circular or non-circular), hence ignoring the Range header.
Size of the sequence is 230218
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=0-230217
HTTP/1.1 206 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
CCACA........GTGGG
GET /sequence/6681ac2f62509cfc220d78751b8dc524/
Range: bytes=0-999999999
HTTP/1.1 206 OK
Content-Type: text/vnd.ga4gh.refget.v1.0.0+plain; charset=us-ascii
Content-Length: 230218
CCACA........GTGGG