TIFF tile size limit

Bob Friesenhahn
Does anyone have an opinion on the maximum tile size which should be
allowed by a general purpose TIFF reader?

For example is 4096x4096 large enough to handle any practical TIFFs?

Bob
--
Bob Friesenhahn
[hidden email], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
Tiff mailing list: [hidden email]
http://lists.maptools.org/mailman/listinfo/tiff
http://www.remotesensing.org/libtiff/

Re: TIFF tile size limit

Kemp Watson-2
What's a "practical" TIFF? 65,536 pixels on a side and 2 GB/4 GB was the
absolute largest anyone could conceive of an image being, just a few years
ago... we're already a thousand times beyond that, with practical images.
I'd say the same size as the maximum size of the image, i.e. any
arbitrary subregion of the image.

With HEIF on the horizon, TIFF will need to be VERY flexible.

W. Kemp Watson

[hidden email]


Objective Pathology Services Limited

8250 Lawson Road
Milton, Ontario
Canada  L9T 5C6

www.objectivepathology.com
tel. +1 (416) 970-7284






Re: TIFF tile size limit

rleigh
In reply to this post by Bob Friesenhahn
On 16/09/2017 19:50, Bob Friesenhahn wrote:
> Does anyone have an opinion on the maximum tile size which should be
> allowed by a general purpose TIFF reader?
>
> For example is 4096x4096 large enough to handle any practical TIFFs?

I'd venture to say any size up to and including the full image size.

There's no real upper limit for writing that I can see.  We already have
4k x 4k CCDs in widespread use; 8k x 8k are now available.  I could
write out the data in whatever size the acquisition hardware provides so
8k x 8k is a reality today and it's only going to get bigger.  Writing
out the data in smaller tiles has a (small) cost in rearranging the data
by copying to a tile buffer I might not want to pay during acquisition.
That cost increases with image size.

For reading there are a few tradeoffs.  We need the tiles to be big
enough that we don't do lots of separate reads since this is costly, but
small enough that we don't waste disc bandwidth and memory by reading
data we will not use.  It's often also helpful if it's in a size which
the graphics/compute hardware can cope with or else we need to further
split it if too big or coalesce if too small.  This will vary wildly
depending upon the application and the hardware.  OpenGL can have a low
limit, but CUDA and OpenCL can be much larger--8k x 8k is the smallest
allowed max size for OpenCL: see
https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/clGetDeviceInfo.html 
(CL_DEVICE_IMAGE2D_MAX_HEIGHT/WIDTH).  So I think it's fair to say that
a practical TIFF reader should be capable of at least 8k x 8k for
today's basic needs, and likely significantly more to be future proof.


Regards,
Roger

Re: TIFF tile size limit

Bob Friesenhahn
On Sat, 16 Sep 2017, Roger Leigh wrote:

> On 16/09/2017 19:50, Bob Friesenhahn wrote:
>> Does anyone have an opinion on the maximum tile size which should be
>> allowed by a general purpose TIFF reader?
>>
>> For example is 4096x4096 large enough to handle any practical TIFFs?
>
> I'd venture to say any size up to and including the full image size.

Technically larger, since the tile size can be larger than the image
size.  There is nothing to prevent a one pixel image from claiming a
huge tile size.

> (CL_DEVICE_IMAGE2D_MAX_HEIGHT/WIDTH).  So I think it's fair to say that
> a practical TIFF reader should be capable of at least 8k x 8k for
> today's basic needs, and likely significantly more to be future proof.

256 MB per tile seems like quite a lot of memory to require from a
client.

Regardless, the problem I am seeing here is that the client uses
TIFFScanlineSize(), TIFFStripSize(), TIFFTileSize(), or
TIFFGetField(tiff,TIFFTAG_IMAGEWIDTH,&width) to determine the amount
of memory to allocate.  These can return huge sizes and it is
difficult for client software to know what sizes are valid.
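
As a rough illustration of the arithmetic behind the 256 MB figure, and of
the kind of guard a client could apply before trusting a reported size, here
is a minimal sketch. It does not call libtiff; the function names, the
assumption of 4 bytes per pixel (e.g. 8-bit RGBA, contiguous samples), and
the idea of a fixed byte budget are all illustrative assumptions.

```python
def uncompressed_tile_bytes(tile_width, tile_length, samples_per_pixel, bits_per_sample):
    """Bytes needed to hold one decoded tile (contiguous samples assumed)."""
    bits_per_row = tile_width * samples_per_pixel * bits_per_sample
    bytes_per_row = (bits_per_row + 7) // 8  # each row is padded to whole bytes
    return bytes_per_row * tile_length


def allocation_is_acceptable(requested_bytes, budget_bytes):
    """Hypothetical policy check: refuse empty or over-budget requests."""
    return 0 < requested_bytes <= budget_bytes


MIB = 1024 * 1024

# An 8k x 8k tile at 4 bytes per pixel (assumed pixel layout).
size_8k = uncompressed_tile_bytes(8192, 8192, samples_per_pixel=4, bits_per_sample=8)
print(size_8k // MIB, "MiB")
print(allocation_is_acceptable(size_8k, budget_bytes=256 * MIB))
```

The budget itself is the open question in this thread; the sketch only shows
where such a limit would be applied, not what its value should be.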

Bob

Re: TIFF tile size limit

Kemp Watson-2
Is that a problem? With memory virtualized to disk, this becomes a storage
size problem, which is out of libtiff's hands. Big apps will always need
big systems.

W. Kemp Watson


Re: TIFF tile size limit

rleigh
In reply to this post by Bob Friesenhahn
On 17/09/17 00:56, Bob Friesenhahn wrote:

> On Sat, 16 Sep 2017, Roger Leigh wrote:
>
>> On 16/09/2017 19:50, Bob Friesenhahn wrote:
>>> Does anyone have an opinion on the maximum tile size which should be
>>> allowed by a general purpose TIFF reader?
>>>
>>> For example is 4096x4096 large enough to handle any practical TIFFs?
>>
>> I'd venture to say any size up to and including the full image size.
>
> Technically larger, since the tile size can be larger than the image
> size.  There is nothing to prevent a one pixel image from claiming a
> huge tile size.

Yes.  Though if it's not rounded up to the next multiple of 16, and/or
the size is excessive, I think you would have reasonable grounds for
warning or erroring out on suspect-but-valid sizes.

>> (CL_DEVICE_IMAGE2D_MAX_HEIGHT/WIDTH).  So I think it's fair to say that
>> a practical TIFF reader should be capable of at least 8k x 8k for
>> today's basic needs, and likely significantly more to be future proof.
>
> 256 MB per tile seems like quite a lot of memory to require from a client.

It's a lot from one point of view.  But if you have images in the range
of multiple gigabytes to terabytes, it's a drop in the ocean, and you'd
have hardware to match to process it effectively.  Now that we can easily
get workstations with 128 GiB and up, it's a reasonable chunk size,
particularly given that the hardware to deal with data of this size is
today's off-the-shelf mid-range hardware.

One thing which might be useful to explore is how best to store data of
this size in TIFF.  For example, is compression and the need for
temporary buffers to store the decompressed data worth it?  Or is it
better to memory map and use transparent filesystem compression at the
block level, e.g. ZFS with lz4/gzip.  This lets zlib and the application
use zero-copy I/O if all you're doing is handing over tiles, and it
means there's no real constraint to the tile size other than virtual
memory and available disc bandwidth.

I've started to benchmark some of this stuff, and I will certainly
report back the findings.

> Regardless, the problem I am seeing here is that the client uses
> TIFFScanlineSize(), TIFFStripSize(), TIFFTileSize(), or
> TIFFGetField(tiff,TIFFTAG_IMAGEWIDTH,&width) to determine the amount of
> memory to allocate.  These can return huge sizes and it is difficult for
> client software to know what sizes are valid.

That is certainly a tough problem to solve.  At least for tiles you can
do an initial sanity check that the size is a multiple of 16.  It would
be nice if such policies were configurable so the safeties can be turned
off or tuned.  Maybe you could factor in the use of memory mapping and
compression so that if it's uncompressed and mapped you allow
arbitrarily large sizes?

Tile/strip sizes gratuitously larger than the image size/width seem
potential candidates for being "too big".  Image width/height are tough
ones--we already have examples where the sizes are well over 400000
pixels on each side.  Maybe it's worth checking that the tile and strip
counts are correct, and match the image size without any full tile or
strip overlaps.  That is to say, rather than imposing a size constraint,
check that the metadata is valid and self-consistent for a well-formed
image.
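
A minimal sketch of such a self-consistency check follows. It is not libtiff
code: the function names and the list-of-messages return value are
assumptions for illustration, it assumes a single image plane with
contiguous samples, and the findings are meant as candidates for warnings
rather than hard errors.

```python
def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def check_tile_layout(image_width, image_length, tile_width, tile_length, offsets_count):
    """Return a list of suspicious findings; an empty list means self-consistent."""
    if min(image_width, image_length, tile_width, tile_length) <= 0:
        return ["image and tile dimensions must be positive"]
    problems = []
    # TIFF 6.0 requires TileWidth and TileLength to be multiples of 16.
    if tile_width % 16 != 0 or tile_length % 16 != 0:
        problems.append("tile size is not a multiple of 16")
    # If a tile 16 pixels smaller would still cover the image, the tile is
    # larger than the image size rounded up to the next multiple of 16.
    if tile_width - 16 >= image_width or tile_length - 16 >= image_length:
        problems.append("tile is larger than needed to cover the image")
    # The number of offset/byte-count entries must match the tile grid.
    expected = ceil_div(image_width, tile_width) * ceil_div(image_length, tile_length)
    if offsets_count != expected:
        problems.append(f"expected {expected} tiles, file declares {offsets_count}")
    return problems


print(check_tile_layout(1000, 1000, 256, 256, offsets_count=16))
print(check_tile_layout(1, 1, 1024, 1024, offsets_count=1))
```

The first call describes a well-formed layout; the second is the one pixel
image in an oversized tile discussed above, which is valid per the
specification but flagged as suspect.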


Regards,
Roger

Re: TIFF tile size limit

Bob Friesenhahn
In reply to this post by Kemp Watson-2
On Sat, 16 Sep 2017, Kemp Watson wrote:

> Is that a problem? With memory virtualized to disk, this becomes a storage
> size problem, which is out of libtiff's hands. Big apps will always need
> big systems.

It is a problem if a 390 byte file causes the reading application to
allocate tens of GB of memory.

Bob

Re: TIFF tile size limit

Kemp Watson-2

"Tile sizes are already allowed to be larger than the image dimensions by
the TIFF specification since they can spill over the right and bottom of
the image.  A one pixel image could be in a 1kx1k "tile"."

Ugh. That's the root of the issue, for sure.

Yes, the compression is not really the issue (LZW will compress whitespace
to almost nothing); it's the allocation of the raw rasters/tiles for the
decompressed data.

I just went back and re-read the Tiled Image specification. You are
completely correct, there is no formal restriction on tile size vs image
size, although there's a lot of "not recommended" verbiage. Personally, I'd
be inclined to limit the tile size to the image size (technically, to the
16-pixel boundary just above the image size, or perhaps to the next larger
power-of-two size to provide 'quadrants'). That would break the
compatibility with the TIFF spec, though. Would that be a problem in
practice? Adobe's not really keeping the spec up to date with modern needs
anyway, and BigTIFF is not a spec either. The reality is that libtiff is
diverging from the TIFF 6.0 specification.

W. Kemp Watson

On 2017-09-17, 11:04 AM, "Bob Friesenhahn" <[hidden email]>
wrote:

>On Sun, 17 Sep 2017, Kemp Watson wrote:
>
>> I know that we use very large bigtiffs, sometimes terabytes in size, but
>> currently we allocate only 512-pixel tiles. I could see that going to 4K
>> maximum in the near future, very practically.
>>
>> But, would limiting the tile size to be up to but not more than the full
>> image dimensions not essentially guarantee that a tiled implementation
>> would not use more memory than a full rasterized image (barring pointers
>> and small stuff)?
>
>Tile sizes are already allowed to be larger than the image dimensions
>by the TIFF specification since they can spill over the right and
>bottom of the image.  A one pixel image could be in a 1kx1k "tile".
>
>> I may well be missing some critical detail here - what in those sample
>> files is the root cause of the large allocations?
>
>That is a good question.  Some compressors are capable of remarkable
>compression ratios and so the files can claim large pixel dimensions
>although the file is very small.  In some cases it is not easy to know
>what a decoder can produce from a very small input.
>
>Even Rouault tells me that one of the compressors is theoretically
>capable of storing a 100000x100000 image in a few hundred bytes.
>
>Bob

Re: TIFF tile size limit

Even Rouault-2

On Sunday 17 September 2017 11:29:24 CEST Kemp Watson wrote:
> "Tile sizes are already allowed to be larger than the image dimensions by
> the TIFF specification since they can spill over the right and bottom of
> the image. A one pixel image could be in a 1kx1k "tile"."
>
> Ugh. That's the root of the issue, for sure.

Another issue, more with the implementation of libtiff than with the spec
itself, is that if you have a large number of tiles or strips, the
allocation of the StripByteCount and StripOffset arrays can be very costly.
Let's take the case of a 1 million x 1 million image with tiles of 16 x 16.
The allocation cost for those arrays is (1e6 / 16) * (1e6 / 16) *
sizeof(uint64) * 2 = 62 GB (actually libtiff would refuse to read the
content of a tag if it is more than 2 GB large), and currently they are
allocated and read entirely as soon as you read the first tile. libtiff
should instead read from the file only the byte count and offset values for
the tile it is going to read.

Even with a more reasonable tile size of 256x256, the cost is still 244 MB
that you need to allocate and read at file opening.
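
The figures above can be reproduced with a few lines. This is only a sketch
of the arithmetic, not libtiff code: the function name is made up for
illustration, and it assumes one 8-byte offset plus one 8-byte byte count
per tile, as in the sizeof(uint64) * 2 term. Note that a divisor of 16 in
the formula corresponds to 16 x 16 tiles.

```python
def tile_index_bytes(image_width, image_length, tile_width, tile_length, entry_bytes=8):
    """Memory for the offset and byte-count arrays of a tiled image."""
    tiles_across = -(-image_width // tile_width)   # ceiling division
    tiles_down = -(-image_length // tile_length)
    # Two arrays (offsets and byte counts), one entry per tile in each.
    return tiles_across * tiles_down * entry_bytes * 2


SIDE = 1_000_000
for tile in (16, 32, 256):
    cost = tile_index_bytes(SIDE, SIDE, tile, tile)
    print(f"{tile:>3} x {tile:<3} tiles: {cost / 1e9:.3f} GB")
```

With these assumptions, 16 x 16 tiles need 62.5 GB, 32 x 32 tiles about
15.6 GB, and 256 x 256 tiles about 0.244 GB (244 MB).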

 

> Yes, the compression is not really the issue (LZW will compress whitespace
> to almost nothing), it's the allocation of the raw rasters/tiles for the
> decompressed data.
>
> I just went back and re-read the Tiled Image specification. You are
> completely correct, there is no formal restriction on tile size vs image
> size, although there's a lot of "not recommended" verbiage. Personally, I'd
> be inclined to limit the tile size to the image size (technically, to the
> 16-pixel boundary just above the image size, or perhaps to the next larger
> power-of-two size to provide 'quadrants'). That would break the
> compatibility with the TIFF spec, though. Would that be a problem in
> practice?

You'd probably break reading very small images (let's say 20x20) that use a
standard tiling size of 256x256 (e.g. someone using GDAL's
"gdal_translate in out.tif -co TILED=YES"). So if we went in this
direction, we would probably need to allow a "reasonable" tile size of,
let's say, 2K x 2K even for 1x1 images.
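
One way to picture the combined rule (limit tiles to the image size rounded
up to a 16-pixel boundary, but never reject anything up to a "reasonable"
floor) is the sketch below. It is not an existing libtiff policy; the
function names and the default floor of 2048 are assumptions taken from the
figures mentioned in this thread.

```python
def round_up_to_multiple(value, multiple=16):
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def max_allowed_tile_dim(image_dim, floor=2048):
    """Largest tile dimension accepted for an image dimension (hypothetical rule)."""
    return max(floor, round_up_to_multiple(image_dim, 16))


def tile_size_allowed(image_width, image_length, tile_width, tile_length, floor=2048):
    """True if both tile dimensions are positive and within the allowed maximum."""
    return (0 < tile_width <= max_allowed_tile_dim(image_width, floor)
            and 0 < tile_length <= max_allowed_tile_dim(image_length, floor))


print(tile_size_allowed(20, 20, 256, 256))     # small image, standard tiling
print(tile_size_allowed(1, 1, 4096, 4096))     # one pixel image, oversized tile
```

Under this rule the 20x20 image with 256x256 tiles is still readable, while
a 1x1 image claiming a 4096x4096 tile is rejected.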

 

> Adobe's not really keeping the spec up to date with modern needs
> anyway, and BigTIFF is not a spec either. The reality is that libtiff is
> diverging from the TIFF 6.0 specification.

--
Spatialys - Geospatial professional services
http://www.spatialys.com



Re: TIFF tile size limit

rleigh
On 17/09/17 17:27, Even Rouault wrote:

> On Sunday 17 September 2017 11:29:24 CEST Kemp Watson wrote:
>
>> "Tile sizes are already allowed to be larger than the image dimensions by
>> the TIFF specification since they can spill over the right and bottom of
>> the image. A one pixel image could be in a 1kx1k "tile"."
>>
>> Ugh. That's the root of the issue, for sure.
>
> Another issue, more with the implementation of libtiff than with the
> spec itself, is that if you have a large number of tiles or strips, the
> allocation of the StripByteCount and StripOffset arrays can be very
> costly. Let's take the case of a 1 million x 1 million image with tiles
> of 16 x 16. The allocation cost for those arrays is (1e6 / 16) * (1e6 /
> 16) * sizeof(uint64) * 2 = 62 GB (actually libtiff would refuse to read
> the content of a tag if it is more than 2 GB large), and currently they
> are allocated and read entirely as soon as you read the first tile.
> libtiff should instead read from the file only the byte count and offset
> values for the tile it is going to read.
>
> Even with a more reasonable tile size of 256x256, the cost is still 244
> MB that you need to allocate and read at file opening.

This has definitely shown up with the (fairly crude) benchmarking I've
done so far.  Note "small" is 2¹⁴×2¹⁴ and "big" is 2¹⁶×2¹⁶.

File size variation:
https://github.com/openmicroscopy/ome-files-performance/blob/master/analysis/tile-test-write-size.pdf 
-- This is synthetic but derived from real data written by libtiff.
It's mainly to show how space is wasted when tiles overlap the image
border, but at the low end the metadata usage becomes significant; you
can see a small increase at the smallest tile sizes from the extra
count/offset array sizes.
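
To make the border effect concrete, here is a small sketch of how much of
the stored pixel area is padding when the tile size does not divide the
image size. It is not taken from the benchmark code; the function names are
made up for illustration, and it assumes uncompressed data so that padded
pixels cost as much as real ones.

```python
def padded_size(image_size, tile_size):
    """Image dimension after rounding up to a whole number of tiles."""
    tiles = -(-image_size // tile_size)  # ceiling division
    return tiles * tile_size


def wasted_fraction(image_width, image_length, tile_width, tile_length):
    """Fraction of stored pixels that lie outside the image."""
    stored = padded_size(image_width, tile_width) * padded_size(image_length, tile_length)
    used = image_width * image_length
    return (stored - used) / stored


SMALL = 2**14
for tile in (4096, 4112, 12000):
    print(tile, f"{wasted_fraction(SMALL, SMALL, tile, tile):.1%}")
```

A tile size that divides the image exactly wastes nothing, while a large
tile that just fails to divide it can waste more than half of the stored
area.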

Writing performance:
https://github.com/openmicroscopy/ome-files-performance/blob/master/analysis/tile-test-write-performance.pdf
-- you can see an even more dramatic effect upon write times with very
small sizes, after which it seems to become I/O bound and tile size
ceases to have any appreciable effect.  There's something which doesn't
scale with vast quantities of small tiles, but it will need more
investigation to determine exactly what.  Since it's negligible with any
tile size greater than 256 it's not a big priority for me right now, but
fixing it (if possible) would have some small benefit.

Tile counts:
https://github.com/openmicroscopy/ome-files-performance/blob/master/analysis/tile-test-count.pdf 
-- purely synthetic and fairly obvious, but demonstrates the scaling
problems with small tile sizes.
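
The scaling can be sketched directly from the image sizes given above
("small" is 2^14 on a side, "big" is 2^16). This is an illustration of the
counting only, not the script behind the linked plot; the function name and
the chosen tile sizes are assumptions.

```python
def tile_count(image_side, tile_side):
    """Number of square tiles needed to cover a square image."""
    per_axis = -(-image_side // tile_side)  # ceiling division
    return per_axis * per_axis


SMALL, BIG = 2**14, 2**16
print(f"{'tile':>5} {'small':>12} {'big':>12}")
for tile_side in (16, 64, 256, 1024, 4096):
    print(f"{tile_side:>5} {tile_count(SMALL, tile_side):>12,} {tile_count(BIG, tile_side):>12,}")
```

Halving the tile side quadruples the tile count, so the smallest tile sizes
produce millions of tiles for both test images.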


Regards,
Roger