A very cool custom video camera with AVFoundation
AVFoundation is a very cool framework that allows you to collect multimedia data generated by different input sources (camera, microphone, etc.) and redirect it to any output destination (screen, speakers, etc.).
You can create custom playback and capture solutions for audio, video and still images. The advantage of this framework over off-the-shelf solutions such as MPMoviePlayerController or UIImagePickerController is that you get access to the raw data from the camera. This lets you apply effects to the input signals in real time for different purposes.
I have prepared for you a small app to show you how to use this framework and create a very cool video camera.
AVFoundation is based on the concept of the session. A session is used to control the flow of the data from an input to an output device. The creation of a session is really straightforward:
AVCaptureSession *session = [[AVCaptureSession alloc] init];
The session allows you to define the audio and video recording quality through the sessionPreset property of the AVCaptureSession class. For this example, low-quality data is fine (and it saves some battery):
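The snippet for this step is not shown here. A minimal sketch, assuming the session object created above and the standard AVCaptureSessionPresetLow preset (my choice for "low quality", not necessarily the one used in the sample app), might look like this:

```objc
// Assumes `session` is the AVCaptureSession created above.
// canSetSessionPreset: guards against presets the current
// configuration does not support.
if ([session canSetSessionPreset:AVCaptureSessionPresetLow]) {
    session.sessionPreset = AVCaptureSessionPresetLow;
}
```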
After the capture session has been created, you need to define the capture device you want to use. It can be the camera or the microphone. In this case, I am going to use the
AVMediaTypeVideo type that supports videos and images:
// Default video device (the camera), matching the AVMediaTypeVideo type described above
AVCaptureDevice *inputDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
Capture Device Input
Next, you need to define the input for the capture device and add it to the session. Here you go:
NSError *error = nil;
AVCaptureDeviceInput *deviceInput = [AVCaptureDeviceInput deviceInputWithDevice:inputDevice error:&error];
if ( [session canAddInput:deviceInput] )
    [session addInput:deviceInput];
You check if you can add the device input to the session and if you can, you add it.
Before defining the device output, I want to show you how to preview the camera buffer. This will be the viewfinder of your camera, i.e. the preview of what the input device is seeing.
We can quickly render the raw data collected by the camera on the screen using the
AVCaptureVideoPreviewLayer. We can create this preview layer using the session we defined above and then add it to our main view layer:
AVCaptureVideoPreviewLayer *previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:session];
CALayer *rootLayer = [[self view] layer];
[previewLayer setFrame:CGRectMake(-70, 0, rootLayer.bounds.size.height, rootLayer.bounds.size.height)];
[rootLayer insertSublayer:previewLayer atIndex:0];
You don’t need to do any additional work. You can now display the camera signal on your screen.
If you want to do something cooler instead, for example processing the camera signal to create nice video effects with Core Image or the Accelerate framework (take a look at this post), you need to collect the raw data generated by the camera, process it and, if you like, display it on the screen.
Go baby, go!!!
We are ready to go. The last thing you need to do is to start the session:
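The snippet for this step is not shown here. A minimal sketch, assuming the session object configured above, could be as simple as the following. The background dispatch is my own addition: startRunning is a blocking call, so it is often kept off the main thread.

```objc
// Assumes `session` already has its input and the preview layer attached.
// startRunning blocks until the session is up, so run it off the main thread.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [session startRunning];
});
```

When you are done with the camera, the matching call is [session stopRunning].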
Since AVCaptureVideoPreviewLayer is a layer, you can add animations to it. I am attaching here a very simple Xcode project showing the previous concepts. It creates a custom video camera with the preview rotating in 3D space.
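To give an idea of what such an animation can look like, here is a rough sketch. It is not the code from the attached project; it assumes the previewLayer and rootLayer variables from the preview snippet above, and the perspective value and duration are purely illustrative.

```objc
#import <QuartzCore/QuartzCore.h>

// Assumes `previewLayer` and `rootLayer` are the layers created earlier.
// A perspective term on the parent makes the rotation read as 3D.
// Note: sublayerTransform affects every sublayer of rootLayer.
CATransform3D perspective = CATransform3DIdentity;
perspective.m34 = -1.0 / 500.0;   // illustrative value
rootLayer.sublayerTransform = perspective;

// Spin the preview around its vertical axis, forever.
CABasicAnimation *spin = [CABasicAnimation animationWithKeyPath:@"transform.rotation.y"];
spin.fromValue = @0.0;
spin.toValue = @(2.0 * M_PI);
spin.duration = 4.0;              // seconds per full turn, illustrative
spin.repeatCount = HUGE_VALF;
[previewLayer addAnimation:spin forKey:@"spin"];
```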
If you want to do some image processing with the raw data captured by the camera and display the result on the screen, you need to collect the data, process it and render it on the screen without using the AVCaptureVideoPreviewLayer. Depending on what you want to achieve, you have two main strategies:
- Either you capture a still picture as soon as you need one; or
- You continuously capture the video buffer
Now, the first approach is the simplest one: whenever you need to know what the camera is looking at, you just shoot a picture. Processing the video buffer is trickier, especially if your image processing algorithm is slower than the camera's frame rate. Here, you need to evaluate which solution is more suitable for your case. Take into account that the image resolution you get depends on the device. For example, the iPhone 4s can provide images of up to 8 megapixels. That's a lot of data to process in real time, so if you are doing real-time image processing, you need to accept lower-quality images. But all these considerations are a topic for a future post.
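In the meantime, here is a rough sketch of the second strategy, so you can see what continuous capture involves. It uses AVCaptureVideoDataOutput and its sample buffer delegate. The FrameGrabber class name, the queue label and the BGRA pixel format are my own assumptions for illustration, not code from the sample project.

```objc
#import <AVFoundation/AVFoundation.h>
#import <CoreImage/CoreImage.h>

// Hypothetical helper class, for illustration only.
@interface FrameGrabber : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
- (void)attachToSession:(AVCaptureSession *)session;
@end

@implementation FrameGrabber

- (void)attachToSession:(AVCaptureSession *)session
{
    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];

    // Drop frames that arrive while the previous one is still being
    // processed, so a slow algorithm does not build up a backlog.
    output.alwaysDiscardsLateVideoFrames = YES;

    // BGRA is a convenient format for Core Image and Accelerate (assumption).
    output.videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };

    // Frames are delivered on this serial queue, not on the main thread.
    dispatch_queue_t queue = dispatch_queue_create("frame.grabber.queue", DISPATCH_QUEUE_SERIAL);
    [output setSampleBufferDelegate:self queue:queue];

    if ([session canAddOutput:output]) {
        [session addOutput:output];
    }
}

// Called once per captured frame.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL) {
        return;
    }
    CIImage *frame = [CIImage imageWithCVPixelBuffer:pixelBuffer];
    // Process `frame` here (Core Image filter, Accelerate, ...).
    (void)frame;
}

@end
```

You would create a FrameGrabber, keep a strong reference to it (the output does not retain its delegate), and call attachToSession: before starting the session. With alwaysDiscardsLateVideoFrames set to YES, slow processing costs you frames rather than memory.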