Author Topic: Camera/ chipset / x264 encoding pipeline  (Read 4055 times)

Offline jimd

  • Member
  • Posts: 19
Camera/ chipset / x264 encoding pipeline
« on: November 24, 2010, 09:39:00 AM »
Hi

I'm looking at building/speccing a set-up for a decent-quality (576p or 720p), near-live (say 10-40 sec delay) encoding pipeline: capture direct from a camera at a remote location with a semi-static scene (e.g. zoo, scenic site, city view), chunk into 3/5/10-second chunks (x264, mb-tree, one IDR at the start of each chunk), then transmit wirelessly into some type of HTTP chunked set-up (e.g. Apple live streaming) for delivery, running in daylight hours.

My current logic goes that the upload bandwidth from the remote location is probably the gating factor (3G? - I'm in the UK), so it's worth spending the effort on good compression (x264, standard or slow presets).

It looks like a set-up with, say, an AMD 1055T (cheapish 6-core), a solid-state drive and 2 GB of RAM probably gives more than the grunt needed (the system would be single-purpose - primarily encoding and transmitting, but possibly with a lighter, more real-time stream (e.g. lower-res intra-refresh) for camera zoom/pan).

Where I have little/no experience is the best/easiest way to attach a camera / live stream. I'm assuming I want a good camera which can selectively produce the correct output in a raw format (e.g. 576p, 720p, or maybe later 1080p) - but then is it better to use USB 2/3 off an essentially digital camera (e.g. Logitech C910), or get a camcorder with USB/FireWire/HDMI output, or look at more specialist cameras? (I found this thing called a Flea3, which is USB and used for astronomy but is pricey; I also looked at the security/IP cams, but they have the encoding built in.)

Anyway, my questions are:
i) Where's the best board to discuss this kind of thing?
ii) Has anyone got a more turn-key system already that uses x264?
iii) Any advice/feedback on the proposed set-up?

Thanks
Jim

Offline wonsik

  • Member
  • Posts: 2
Re: Camera/ chipset / x264 encoding pipeline
« Reply #1 on: November 24, 2010, 05:42:58 PM »
I don't think any encoder can encode 576p at 30 fps within 200 kbps (a typical 3G network bandwidth limit) at decent quality. You'll probably have to reduce the resolution or frame rate.

Offline Quarkboy

  • Member
  • Posts: 33
Re: Camera/ chipset / x264 encoding pipeline
« Reply #2 on: November 24, 2010, 10:41:51 PM »
Quote
I think no encoder can encode 576p at 30fps within 200kbps (typical 3G network bandwidth limit) in decent quality. Probably you have to adjust the resolution or frame rate.
It would be fine for encoding a scene that really was almost entirely static. But once you get any actual motion (or a change in lighting), it would probably block to hell and back at that bitrate, yeah...

480p at 200 kbps can work for super-artificial sources (like animation, for example), but real-life video and all the noise it entails is pretty tough. I would suggest doing 320x240 at half the frame rate, 15 fps. That can be done over a 200 kbps link.

Does the UK have 4G (i.e. WiMAX) coming online? If you use a service like that, which can hit 1000 kbps, then you can definitely do 480p at 30 fps.
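The arithmetic behind these quality limits is simple enough to sketch. As a rough rule of thumb (my own figure, not from this thread), live-action H.264 tends to want somewhere near 0.08-0.1 bits per pixel for decent quality; assuming a 1024x576 widescreen frame for 576p:

```python
def bits_per_pixel(width, height, fps, bitrate_bps):
    """Average bits available per pixel per frame at a given bitrate."""
    return bitrate_bps / (width * height * fps)

# 576p at 30 fps over a 200 kbps 3G uplink: far below ~0.08 bpp
print(round(bits_per_pixel(1024, 576, 30, 200_000), 4))  # 0.0113
# 320x240 at 15 fps over the same link: a much healthier budget
print(round(bits_per_pixel(320, 240, 15, 200_000), 4))   # 0.1736
```

This is why dropping resolution and frame rate together is the suggestion above: the pixel rate falls by roughly 15x, so the same 200 kbps goes 15x further.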

Offline jimd

  • Member
  • Posts: 19
Re: Camera/ chipset / x264 encoding pipeline
« Reply #3 on: November 25, 2010, 12:20:20 AM »
Sorry - I should have been clearer about the bit rate - I agree 200 kbit/s is a non-starter.

It looks like some of the Orange or 3 HSUPA dongles are giving between 500 kbit/s and 1.4 Mbit/s upload speeds.

So I'm 'assuming' this kind of bit rate - though I believe it is very dependent on local conditions (e.g. upgraded base stations etc., which will probably not be the ones near where I want to stream).

Background on HSUPA upload speeds:

e.g. http://www.3g.co.uk/3GForum/archive/index.php/t-100236.html
and

--------------- http://www.3g.co.uk/3GForum/archive/index.php/t-97718.html
planetf1
07-02-2010, 10:31 AM
When I had a quick test on my new N900 (bought retail unlocked) I got just over 1Mbps down, but a very nice 1.2 Mbps upstream.

Not sure how accurate that test was -- need to do it some more to be sure, but the N900 is my first phone with HSUPA (fast upstream)

The response times (ping) were also far quicker than I've seen before on a HSDPA only device at around 90ms
DBMandrake
07-02-2010, 10:46 AM
1.2Mbit is typical for upstream in HSUPA, and quite believable. As is a lower ping time than HSDPA alone.

HSDPA technology reduces the latency considerably (as well as increasing speeds) in the downstream direction compared to plain 3G/UMTS, but does nothing for the upstream. HSUPA adds the same kind of technology in the upstream direction as well, so improves not only upload speed but also upstream latency, and since ping times are the sum of downstream and upstream latency, the total latency is reduced :)
---------------

Offline wonsik

  • Member
  • Posts: 2
Re: Camera/ chipset / x264 encoding pipeline
« Reply #4 on: November 25, 2010, 06:10:17 PM »
Well if the bandwidth is enough, you can just use VLC player (which depends on x264 for encoding). Check out http://www.videolan.org/doc/streaming-howto/en/

Offline jimd

  • Member
  • Posts: 19
Re: Camera/ chipset / x264 encoding pipeline
« Reply #5 on: November 26, 2010, 07:50:46 AM »
I *think* I want something more complicated than VLC  - here's my reasoning

My primary focus is to get the best out of the variable bandwidth that a 3G/4G HSUPA upload will provide (as before - I believe this is 600 kbit/s to 1.3 Mbit/s), so I'm transmitting an HD-ish (576p, 720p) pretty picture.

So the plan would be

a) capture a certain amount of raw video at, say, 576p, e.g. T1 = 10 seconds
b) use x264 to find a decent place for a scene cut within a time range (e.g. T2 = 4-6 seconds)
c) use x264 to encode that cut at the 'best rate' current conditions allow (e.g. some kind of feedback based on the last upload time / encoding time to choose both the preset and the CRF / target bandwidth)
d) upload the encoded chunk as an MP4 to a source server
e) (in parallel with d) go back to step b) and search for the next chunk in the buffer

This means the delay for the end user will be something like T1 + time to encode the chunk (say 1x real time, so ~5 seconds) + time to transmit to the source server + time to transmit to the end user + at least one more chunk in the end user's buffer - i.e. approx 30-40 seconds of delay.
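That delay estimate can be written out as a simple sum (all the stage timings below are illustrative placeholders, picked to match the figures above):

```python
def end_to_end_delay(capture_s, encode_s, upload_s, cdn_s,
                     buffered_chunks, chunk_s):
    """Sum the stages between the lens and the viewer's screen."""
    return capture_s + encode_s + upload_s + cdn_s + buffered_chunks * chunk_s

# T1 = 10 s capture buffer, ~5 s encode, ~5 s upload, ~2 s to the end user,
# plus two 5 s chunks sitting in the player's buffer
print(end_to_end_delay(10, 5, 5, 2, 2, 5))  # 32
```

Which lands in the 30-40 second ballpark; the capture buffer T1 and the player-side chunk buffer dominate everything else.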

A simpler approach would cut each chunk at a fixed time interval, e.g. T1 = T2 = 5 seconds - this would presumably lose some coding efficiency, but would be more friendly and predictable for Apple live streaming.

A more complicated version would feed back from step c) to potentially change the camera capture resolution, e.g. the bandwidth/compression outcome is good, so let's move from 576p to 720p.

I want to be able to experiment with the time slots T1 and T2, because I'm not sure whether users will prefer a longer overall delay (e.g. T1 ~ T2 ~ 20 sec, so say 2 minutes of lag) and better quality, or a shorter delay and a shorter feedback loop.

So what I think I'm doing is tying together a capture program, one or two instances of x264, and some scripting. If this doesn't work, then I assume I'm working against the x264 API, but that will be harder for me (I may have to pay to get it done). The risk of working at this low level is that later adding e.g. audio may become more complex - but I don't currently want to lose the control I would give up by going to ffmpeg or VLC. Finally, I see the 6-core AMD as a bit of a get-out-of-jail card, because if something needs to be handled in a separate process (e.g. finding the right place for a scene cut, or uploading the file chunk) then this can be done without performance loss.
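The feedback in step c) could be as simple as measuring the throughput of the last upload and targeting a fraction of it. A hypothetical sketch (the 0.8 safety margin and the clamp bounds are my own illustrative choices, loosely matching the HSUPA range mentioned above):

```python
def next_target_bitrate(chunk_bytes, upload_secs, safety=0.8,
                        floor_bps=300_000, ceiling_bps=1_400_000):
    """Pick the next chunk's target bitrate from the last chunk's
    observed upload throughput, with a safety margin, clamped to a
    plausible range for the link."""
    observed_bps = chunk_bytes * 8 / upload_secs
    target = observed_bps * safety
    return int(min(max(target, floor_bps), ceiling_bps))

# Last 5 s chunk was 625 kB and took 5 s to upload -> ~1 Mbit/s observed
print(next_target_bitrate(625_000, 5.0))  # 800000
```

The returned figure could then be fed to x264 as a target bitrate (with VBV caps) or used to pick between a few preset CRF/resolution rungs.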

<however - I'm new to this and maybe I'm completely wrong and VLC/HTTP transmission to a server is good enough>

<edit - in some ways what I'm trying to do is build a mini Spinnaker replacement that only handles one format, but spends more time trying to get the best encoding for the local conditions>

thanks
jim

« Last Edit: November 26, 2010, 08:03:24 AM by jimd »

Offline nm

  • Member
  • Posts: 358
Re: Camera/ chipset / x264 encoding pipeline
« Reply #6 on: November 26, 2010, 08:18:45 AM »
I think that's too complex. Just feed the video to x264 and let its rc-lookahead do the job. Keyframe interval (divided by 2) determines how long the user needs to wait on average to see the first frame after starting to receive the stream. Encoding latency is probably dominated by VBV buffer size and how many frames x264 buffers for lookahead. Overall latency has many other factors too, of course.
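For reference, here is roughly how those knobs map onto the x264 command line. The concrete numbers (25 fps, 5-second keyframe interval, 1000 kbit/s VBV) are my own illustrative choices, not recommendations from this thread:

```python
def x264_stream_cmd(fps=25, keyint_secs=5, vbv_kbps=1000):
    """Build an illustrative x264 CLI invocation: a keyframe every
    keyint_secs (bounding the average join wait to keyint/2 frames),
    VBV sized to the link, and rc-lookahead buffering frames for
    mb-tree rate control."""
    keyint = fps * keyint_secs
    return ["x264", "--preset", "slow",
            "--keyint", str(keyint),
            "--vbv-maxrate", str(vbv_kbps),
            "--vbv-bufsize", str(vbv_kbps),
            "--rc-lookahead", "40",
            "--fps", str(fps),
            "-o", "out.264", "input.y4m"]

print(" ".join(x264_stream_cmd()))
```

A bigger `--vbv-bufsize` allows better quality at the cost of more buffering latency, which is the trade-off described above.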

VLC should work as a streaming platform, but there may be other good alternatives.

Btw. Have you read this already: http://x264dev.multimedia.cx/archives/249
« Last Edit: November 26, 2010, 08:27:07 AM by nm »

Offline jimd

  • Member
  • Posts: 19
Re: Camera/ chipset / x264 encoding pipeline
« Reply #7 on: November 26, 2010, 09:15:25 AM »
OK - thanks for the link - the comments were helpful, but it's not 100% what I need: intra-refresh is great, but as I understand it, it gives lower compression as the trade-off for lower latency, and I'm really focused on the best compression I can get as a trade-off for 30-40 secs of total latency. Also, cutting into small-ish chunks makes content delivery, rewind and other things easier without needing a streaming server.

I think I still have the issue that goes: if I want to get the efficiency of mb-tree/rc-lookahead, don't I need to give x264 a 'whole' file, or is it the case that if I pipe in the input and pipe out the output, x264 will spit out chunks at a time (presumably based on keyint etc.)?
Is this output then raw .264? So if I wanted to cut it and wrap it into separate MP4 chunks, I would do this separately - time to go and have a look.

thanks
Jim

Offline nm

  • Member
  • Posts: 358
Re: Camera/ chipset / x264 encoding pipeline
« Reply #8 on: November 26, 2010, 10:01:15 AM »
Quote
Ok -thanks for the link- the comments were helpful,  but it is not 100% what I need as -intra-refresh is great, but as I understand will give lower compression as the trade-off for lower latency, and I'm really focussed on the best compression I can get as a trade-off for 30-40 secs of total latency

Intra refresh helps with packet loss if your connection is bad, but you'll find out what's best when you test it in practice. I didn't mean to point to the low-latency and intra-refresh stuff but to the overall description of x264's properties in a streaming scenario.

Quote
and also cutting into small-ish chunks makes content delivery, rewind and other things easier without needing a streaming server

Well, sounds like you're going to code the streaming server stuff yourself to be able to stream and play those chunked files. Isn't that more difficult than using some existing solution that can be extended to store the stream and seek?

Quote
I think I still have the issue that goes - If I want to get the efficiency of mb-tree/rc-lookahead, then don't I need to give x 264 a 'whole' file, or is it the case that if I pipe in the input, and pipe out the output, then x264 will spit out chunks at a time (presumably based on keyint etc.)

I haven't used libx264 directly, but I guess it works in about the same way as the CLI: you pipe the raw video in, x264 buffers as many frames as it needs for lookahead and other stuff, and it outputs the encoded video as the encoding progresses, a frame at a time (in decoding order). Or actually, you call the encoding function for each frame you want to output and it takes the time it needs.
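A toy model of that buffering behaviour (purely illustrative; this mimics the frame delay only, not the real libx264 API): frames go in one at a time, nothing comes out until the lookahead buffer fills, and thereafter each input yields one output, with a flush draining the tail.

```python
from collections import deque

class LookaheadEncoder:
    """Mimics x264's frame delay: holds `lookahead` frames before
    emitting anything, then emits one encoded frame per input."""
    def __init__(self, lookahead=40):
        self.lookahead = lookahead
        self.buf = deque()

    def encode(self, frame):
        self.buf.append(frame)
        if len(self.buf) > self.lookahead:
            return f"nal({self.buf.popleft()})"
        return None  # still priming the pipeline

    def flush(self):
        # At end of stream, drain whatever is still buffered
        while self.buf:
            yield f"nal({self.buf.popleft()})"

enc = LookaheadEncoder(lookahead=3)
out = [enc.encode(i) for i in range(5)] + list(enc.flush())
print(out)  # [None, None, None, 'nal(0)', 'nal(1)', 'nal(2)', 'nal(3)', 'nal(4)']
```

The practical consequence for the chunked pipeline: the first `lookahead` frames of each chunk add latency before any output appears, which is part of the encoding-latency budget mentioned earlier.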

Quote
Is this output  then .264 ?

If you use libx264 in your own application, you'll get an elementary stream that you can mux to whatever container you want.
« Last Edit: November 26, 2010, 10:07:50 AM by nm »

Offline jimd

  • Member
  • Posts: 19
Re: Camera/ chipset / x264 encoding pipeline
« Reply #9 on: November 26, 2010, 02:24:11 PM »
Thanks for the points - it makes a lot of sense - I get my testing rig on Monday - so I'll report back some time later in the week on how it has gone.