Recently I've been considering what types of hardware setups might be the next natural step for training the model. In the past I've used a Jetson TX1 devkit, which is a rather modest piece of hardware and allows me to do a bit heavier finetuning. I have this running on its own in a back bedroom, and it's helpful to have a box that is dedicated to that rather than trying to share time with my laptop or a desktop, for example.
However, the dataset I train on has grown considerably, now over 200K images. This is large enough that I believe it may be reasonable to consider training from scratch rather than finetuning. Training from scratch, however, requires much more computing power - and ideally could run in significantly larger batch sizes.
In fact, one of the key things I wanted to be able to do was run batch sizes on par with training "modern" CNN's from scratch using ImageNet. This varies widely, but let's say ~100 images or so per batch as a reasonable target. Generally speaking, this means more GPU RAM.
However, I'm doing my best to keep on a relatively tight budget - ideally something around $1000. This is lower than many new setups typically run, even ones aimed at the budget-conscious consumer - see for example this blog post with this nice, new box settling in at a reasonable $1700. This necessitated some deep introspection about what I was truly try to achieve, and ultimately I settled on roughly this set of priorities:
- GPU RAM
- Speed
- Future expansion
With these priorities in mind, I considered several possibilities for a new or second-hand setup. With a different set of constraints than many considering a machine learning rig, I was pleasantly surprised by the sheer diversity of possible solutions.
- Purchase a Jetson AGX Xavier devkit. The price on these seems to have been reducing, and now the devkit just got upgraded to 32GB of RAM shared between GPU and CPU.
- Build a desktop and add higher-quality consumer-level cards like the popular 1080 TI or perhaps even the newer 2080 to it.
- Build a specialized rig and purchase many second-hand crypto mining rig cards - sort of a quantity over quality approach.
- Look to older server solutions and see what was available.
Ultimately I settled on an older GPU server by SuperMicro, the 2027GR-TRF. I currently plan to add one K80 to it, but it has support for up to two if I wish to expand in the future. I have recently been working on acquiring the full solution in various pieces, primarily through Ebay. I have gotten most of the parts but need a few more before I can put it all together, so stay tuned for more updates!
No comments:
Post a Comment