Same patch as D10394, but using the H264 codec with a small GOP size (2 frames in this case).
The main reason to use H264 is smaller file size: the current patch produces files about 3x smaller.
Playback is also about 1.5x faster than with MJPEG.
The complexity of this patch is much higher though, because the encoder must work in its own thread pool and encode frames with its own GOP size.
This means that scaled frames for the encoder must be allocated per packet, which leads to worse performance overall. The patch is still about 3x faster than the original. Interestingly, there is no performance improvement when transcoding runs in more than 4 threads.
One possible way to improve performance is to reuse memory for packets and frames. This would require a redesign in which threads wait for data to be written.
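As a rough illustration of the reuse idea, here is a minimal sketch of a fixed-size buffer pool in which a thread blocks until a buffer is free. This is not code from the patch: `FramePool`, `POOL_SIZE` and the `frame_pool_*` functions are hypothetical names, and the buffers are plain `void *` pointers standing in for whatever frame or packet storage would actually be reused.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 4 /* Hypothetical number of reusable buffers. */

typedef struct FramePool {
  pthread_mutex_t lock;
  pthread_cond_t available;
  void *buffers[POOL_SIZE]; /* Stack of currently free buffers. */
  int free_count;
} FramePool;

/* Take ownership of POOL_SIZE preallocated buffers. */
static void frame_pool_init(FramePool *pool, void *buffers[POOL_SIZE])
{
  pthread_mutex_init(&pool->lock, NULL);
  pthread_cond_init(&pool->available, NULL);
  for (int i = 0; i < POOL_SIZE; i++) {
    pool->buffers[i] = buffers[i];
  }
  pool->free_count = POOL_SIZE;
}

/* Return a free buffer, blocking the caller while none is available. */
static void *frame_pool_acquire(FramePool *pool)
{
  pthread_mutex_lock(&pool->lock);
  while (pool->free_count == 0) {
    pthread_cond_wait(&pool->available, &pool->lock);
  }
  void *buf = pool->buffers[--pool->free_count];
  pthread_mutex_unlock(&pool->lock);
  return buf;
}

/* Hand a buffer back and wake one waiting thread, if any. */
static void frame_pool_release(FramePool *pool, void *buf)
{
  pthread_mutex_lock(&pool->lock);
  assert(pool->free_count < POOL_SIZE);
  pool->buffers[pool->free_count++] = buf;
  pthread_cond_signal(&pool->available);
  pthread_mutex_unlock(&pool->lock);
}
```

With such a pool, allocation cost is paid once up front, and the pool size bounds memory use at the price of threads stalling when all buffers are in flight.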
Another, possibly much cleaner, method would be to decode portions of the input into multiple streams and remux these into the proxy file.
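The partitioning step of that approach could look roughly like the sketch below, which splits a frame range into per-thread segments whose boundaries fall on multiples of the GOP size. `Segment` and `split_into_segments` are hypothetical names, not part of the patch, and the sketch assumes the input can be decoded starting from any segment boundary, which in practice depends on where the input's keyframes are.

```c
#include <assert.h>

/* Half-open frame range [start, end). */
typedef struct Segment {
  int start;
  int end;
} Segment;

/* Split [0, total_frames) into at most max_segments ranges whose start
 * frames are multiples of gop_size. Returns the number of segments written
 * to `out`, which must have room for max_segments entries. */
static int split_into_segments(int total_frames, int gop_size, int max_segments, Segment *out)
{
  assert(total_frames >= 0 && gop_size > 0 && max_segments > 0);

  /* Number of GOPs, counting a trailing partial GOP as one. */
  int total_gops = (total_frames + gop_size - 1) / gop_size;
  int count = total_gops < max_segments ? total_gops : max_segments;

  int start_gop = 0;
  for (int i = 0; i < count; i++) {
    /* Spread the remainder over the first segments. */
    int gops = total_gops / count + (i < total_gops % count ? 1 : 0);
    int end_gop = start_gop + gops;
    int end = end_gop * gop_size;

    out[i].start = start_gop * gop_size;
    out[i].end = end < total_frames ? end : total_frames;
    start_gop = end_gop;
  }
  return count;
}
```

Each segment could then be encoded independently and the resulting packets written to the proxy file in segment order.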
In both cases, managing complexity would be fairly important, as it can get out of hand quite quickly.