Wednesday, May 15, 2013

Propeller VGA-lyzer Part 2

The circuit for this is pretty simple - QX-10 video in one end of a Propeller protoboard, VGA out the other.  There are some resistors to protect the Propeller inputs.  Since the QX-10 sends 12V power down the video cable, I can power the Propeller board from the QX-10 video cable alone (the white connector on the right).

As it turns out, this circuit is too simple, because I have the pixel input going straight into the Propeller with no preprocessing.  I accidentally dropped a zero in my back-of-the-envelope calculations about the input data rate.  I estimated the pixel clock to be about 1.44 MHz, which can easily be sampled by a 20 MIPS Propeller core.  In fact the pixel clock is going to be in the 14.4 MHz range, and you need more than 20 million instructions per second to read and store that many pixels per second (you have to burn some of your instruction budget to shift, store, and loop).
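To make the dropped zero concrete, here is a rough sketch of the instruction budget per pixel for one core.  The 20 MIPS and the two pixel clock figures come from the text above; the function name is just something I made up for illustration.

```python
# Back-of-the-envelope instruction budget for sampling one pixel per loop pass.
CORE_MIPS = 20.0  # one Propeller core, millions of instructions per second (from the text)

def instructions_per_pixel(pixel_clock_mhz, core_mips=CORE_MIPS):
    """How many instructions one core can execute during a single pixel time."""
    return core_mips / pixel_clock_mhz

mistaken_budget = instructions_per_pixel(1.44)   # the estimate with the dropped zero
actual_budget = instructions_per_pixel(14.4)     # the real pixel clock range

print(f"at 1.44 MHz: {mistaken_budget:.1f} instructions per pixel")
print(f"at 14.4 MHz: {actual_budget:.2f} instructions per pixel")
```

With fewer than two instructions available per pixel, there is no room left for a read, a shift, a store, and a loop branch.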

There are various tricks I could do with the Propeller to get around this.  Some existing Propeller video drivers use as many as four cores working together to achieve higher video output resolutions than normally would be possible.  I could use a similar technique here to share the sampling load among multiple cores.  However, going to such software gymnastics to save on hardware seems like it would make it harder to debug than is worthwhile.  Since there's no penalty for reading in data 8 bits at a time rather than 1 bit at a time, I can get a free performance improvement simply by sampling the pixel data into a 74HC595 8-bit shift register on the front end before consuming it in 8-bit chunks.  This will provide a second tangible advantage over the multiple-cores method:  with a shift register on the front end I can skew the phase of the pixel clock independently of, and more finely than, the microcontroller software instruction rate.  That's important because if you sample pixels at the right rate but the wrong time you'll get shimmering.  Optimizing the phase of the pixel sampling clock with respect to the video signal will get the best picture.
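As a rough check on how much the shift register buys, the sketch below compares the per-read instruction budget for 1-bit and 8-bit reads.  It uses only the 20 MIPS and 14.4 MHz figures from above, and it assumes a parallel read of 8 bits costs the same as a read of 1 bit, as stated in the text.

```python
# Instruction budget per read when pixels arrive 1 bit or 8 bits at a time.
CORE_MIPS = 20.0        # one Propeller core (from the text)
PIXEL_CLOCK_MHZ = 14.4  # approximate QX-10 pixel clock (from the text)

def instructions_per_read(bits_per_read):
    """Instructions available between successive reads of bits_per_read pixels."""
    reads_per_microsecond = PIXEL_CLOCK_MHZ / bits_per_read
    return CORE_MIPS / reads_per_microsecond

one_bit_budget = instructions_per_read(1)
eight_bit_budget = instructions_per_read(8)

print(f"1 bit per read:  {one_bit_budget:.2f} instructions")
print(f"8 bits per read: {eight_bit_budget:.1f} instructions")
```

About eleven instructions per byte is still tight, but it leaves room for a read, a store, and a loop in a single core.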

However, that will be a future revision of the circuit; rather than make that hardware change immediately, I decided to see what I could accomplish with the hardware I have built.  (Who knows, maybe I'll discover an additional tweak or feature I need to add to the hardware, so I might as well find out what I can.)

Although I will certainly need to use assembly language for the final product, I chose to use the interpreted language Spin for preliminary exploration, because it will be easier to debug.  Even though Spin is woefully slow compared to the video data rate, it does still allow accurate and precise wait timing, and there is a trick you can do with precision timing to capture a video frame slowly over time.  The idea is that rather than trying to capture the entire frame at once, you wait a precise amount of time after the sync signal to capture a certain single pixel from a line, then you wait an entire frame to capture the next pixel, and so on.  It is slow but it will work for static images.
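The slow-capture trick can be sketched as a small simulation.  This is not my Spin code, just a Python model of the idea under the assumption that the image is perfectly static and the delay timing is exact; the names make_static_signal and slow_capture are hypothetical.

```python
# Model of slow capture: one sample per repetition of a static signal,
# each taken at a slightly longer delay after the sync pulse.
PIXEL_CLOCK_HZ = 14.4e6            # approximate pixel clock (from the text)
PIXEL_PERIOD = 1.0 / PIXEL_CLOCK_HZ

def make_static_signal(pattern):
    """Return a function giving the video level at time t (seconds) after sync."""
    def level_at(t):
        index = int(t / PIXEL_PERIOD)
        return pattern[index] if 0 <= index < len(pattern) else 0
    return level_at

def slow_capture(level_at, width):
    """Rebuild `width` pixels, taking exactly one sample per repetition."""
    captured = []
    for x in range(width):
        # Wait for sync (implicit here), then delay to the middle of pixel x.
        delay = (x + 0.5) * PIXEL_PERIOD
        captured.append(level_at(delay))
    return captured

pattern = [1, 0, 1, 1, 0, 0, 1, 0]
result = slow_capture(make_static_signal(pattern), len(pattern))
print(result)
```

Sampling at the middle of each pixel is the ideal case; any jitter in the real delay that approaches half a pixel period will start picking up the neighboring pixel instead.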

Speaking of debugging, I also found that an important debugging technique is to have the software toggle one or more pins at certain times in the loops.  For instance, originally I tried the slow-capture trick based off the horizontal sync signal, but when that didn't work I suspected my Spin code might be too slow for the 19 kHz signal and might be dropping syncs.  By having the code toggle a pin every time it captured an hsync, I was able to see with the scope (actually with its frequency counter feature) that the code was indeed indicating fewer hsyncs captured than expected, and that the count was unstable.

After discovering my Spin code could not keep up with the horizontal sync, I swapped the nesting of my "x" and "y" loops and switched to basing all of my video capture timing off the much slower vertical syncs.  Doing this with only vertical sync is 400 times slower than even the slow horizontal sync way.  I am able to read just one pixel per frame.
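To get a feel for how slow one pixel per frame is, here is a rough estimate.  The 19 kHz hsync and the factor of 400 lines come from the text; the 640-pixel width and the neglect of vertical blanking are assumptions on my part, so treat the result as an order-of-magnitude figure only.

```python
# Rough capture time at one pixel per frame.
HSYNC_HZ = 19_000        # horizontal sync rate (from the text)
LINES_PER_FRAME = 400    # from the "400 times slower" figure; ignores vertical blanking (assumption)
PIXELS_PER_LINE = 640    # assumed visible width, not stated in the text

def capture_minutes(width, height, frame_rate_hz):
    """Minutes needed to capture width*height pixels at one pixel per frame."""
    return width * height / frame_rate_hz / 60.0

frame_rate_hz = HSYNC_HZ / LINES_PER_FRAME
full_frame_minutes = capture_minutes(PIXELS_PER_LINE, LINES_PER_FRAME, frame_rate_hz)

print(f"frame rate: about {frame_rate_hz:.1f} Hz")
print(f"full frame: about {full_frame_minutes:.0f} minutes")
```

Under these assumptions a full frame takes on the order of an hour and a half, which is why cropping to a small region of interest matters so much.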

I was at last able to capture something from the QX-10 video signal!  Since it takes so long to capture a frame, I tweaked the code to concentrate on just the section of the frame where something interesting is onscreen, oversampled it and blew it up.  The image below is 2x super-sampled and 2x to 3x magnified.  I can't quite be sure, but I believe through the terrible mess of jitter and noise in that frame capture I can make out the message "INSERT DISKETTE".  (The E's are suggestive.)


I'm a little disappointed that the image isn't clearer, but I guess I should be amazed that it even works.  What is happening in this picture is that the code takes a single sync signal for the whole frame, waits a calculated period of time that has to be accurate to an average of 1 part in 160,000, and captures a single pixel out of hundreds of thousands.
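One plausible way to arrive at a figure of that order is to count pixel times per frame and take half of that as the average delay after vsync.  The sketch below uses the 14.4 MHz and 19 kHz figures from earlier and assumes 400 lines per frame with no vertical blanking, so it comes out somewhat under 160,000; it illustrates the scale rather than reconstructing the exact calculation.

```python
# Scale of the timing precision needed when delaying from vsync to one pixel.
PIXEL_CLOCK_HZ = 14.4e6   # approximate pixel clock (from the text)
HSYNC_HZ = 19e3           # horizontal sync rate (from the text)
LINES_PER_FRAME = 400     # assumption: ignores vertical blanking lines

pixel_times_per_line = PIXEL_CLOCK_HZ / HSYNC_HZ
pixel_times_per_frame = pixel_times_per_line * LINES_PER_FRAME
# A pixel chosen at random sits, on average, half a frame after vsync.
average_delay_pixel_times = pixel_times_per_frame / 2

print(f"pixel times per line:  {pixel_times_per_line:.0f}")
print(f"pixel times per frame: {pixel_times_per_frame:.0f}")
print(f"average delay:         {average_delay_pixel_times:.0f} pixel times")
```

Being off by a single pixel time out of roughly that many is enough to land on the wrong pixel, which goes some way toward explaining the jitter in the capture.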
