Playing Audio and Video with MediaCodec


Goals

Learn the MediaCodec API; implement hardware encoding and decoding of AAC audio and of H.264 video.

The MediaCodec class gives access to low-level media codecs, i.e. the audio/video encoder and decoder components. It is part of Android's low-level multimedia support infrastructure and is typically used together with MediaExtractor, MediaSync, MediaMuxer, MediaCrypto, MediaDrm, Image, Surface, and AudioTrack.

A codec processes input data to produce output data, working asynchronously over a set of input and output buffers: you take an empty input buffer, fill it with data, and send it to the codec for processing; the codec transforms the data and writes the result into an empty output buffer; finally you take the output buffer, consume its contents, and release it back to the codec. If there is more data to process, the codec repeats these steps. The overall flow is sketched below.
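A minimal sketch of this loop in synchronous mode (the decoder, extractor and TIMEOUT_USEC used here are assumed to already exist; the complete versions appear later in this article):

boolean inputDone = false;
boolean outputDone = false;
ByteBuffer[] inputBuffers = decoder.getInputBuffers();
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
while (!outputDone) {
    // 1. take an empty input buffer, fill it from the extractor, queue it back
    if (!inputDone) {
        int inIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
        if (inIndex >= 0) {
            int size = extractor.readSampleData(inputBuffers[inIndex], 0);
            if (size >= 0) {
                decoder.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                extractor.advance();
            } else {
                // no more samples: signal end of stream
                decoder.queueInputBuffer(inIndex, 0, 0, 0L, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                inputDone = true;
            }
        }
    }
    // 2. take a filled output buffer, consume (or render) it, then hand it back
    int outIndex = decoder.dequeueOutputBuffer(info, TIMEOUT_USEC);
    if (outIndex >= 0) {
        // ... consume the decoded data here, or render it to a Surface ...
        decoder.releaseOutputBuffer(outIndex, false);
        outputDone = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0;
    }
}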

A codec handles three kinds of data: compressed data, raw audio data, and raw video data. All three can be processed through ByteBuffers, but for raw video you should provide a Surface on which the raw video data is displayed; this also improves codec performance. A Surface uses native video buffers that are neither mapped nor copied into ByteBuffers, which makes the codec more efficient. When using a Surface you normally cannot access the raw video data, although you can use ImageReader to access the decoded raw frames. In ByteBuffer mode, you can access raw video frames through the Image class and getInput/OutputImage(int).
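For that ByteBuffer mode, a minimal sketch of reading one decoded frame through the Image API (it assumes a decoder that was configured without a Surface and has already been started; getOutputImage requires API 21+):

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = decoder.dequeueOutputBuffer(info, 10000);
if (outIndex >= 0) {
    Image image = decoder.getOutputImage(outIndex); // raw YUV frame; may be null
    if (image != null) {
        Image.Plane[] planes = image.getPlanes();   // Y/U/V planes with row and pixel strides
        // ... copy or process the pixel data here ...
        image.close();
    }
    decoder.releaseOutputBuffer(outIndex, false);
}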

Codec lifecycle:

The main states are Stopped, Executing, and Released.

Stopped has three sub-states: Uninitialized, Configured, and Error.

Executing has three sub-states: Flushed, Running, and End-of-Stream.

  1. When a codec is created it is in the Uninitialized state. First call configure(…) to move it to the Configured state, then call start() to move it to the Executing state. In the Executing state you can process data through the buffers described above.
  2. Within Executing there are three sub-states: Flushed, Running, and End-of-Stream. Immediately after start() the codec is in the Flushed state, where it holds all of its buffers. As soon as the first input buffer is dequeued, the codec moves to the Running state. When an input buffer with the end-of-stream flag is queued, the codec enters the End-of-Stream state; it then no longer accepts input buffers but still produces output buffers. At any point you can call flush() to reset the codec to the Flushed state.
  3. Calling stop() returns the codec to the Uninitialized state, after which it can be configured again. Once you are finished with a codec, you must release it by calling release().
  4. In rare cases the codec may encounter an error and move to the Error state. This is communicated through an invalid return value from a queuing operation, or sometimes through an exception. Call reset() to make the codec usable again; it can be called from any state and moves the codec back to the Uninitialized state. Otherwise, call release() to move it to the terminal Released state.
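Put together, a typical pass through these states looks roughly like this (a sketch only; format and surface are assumed to be prepared elsewhere, and createDecoderByType throws IOException):

MediaCodec codec = MediaCodec.createDecoderByType("video/avc"); // Uninitialized
codec.configure(format, surface, null, 0);                      // Configured
codec.start();                                                  // Executing: Flushed
// ... dequeue/queue input buffers -> Running ...
// ... queue a buffer with BUFFER_FLAG_END_OF_STREAM -> End-of-Stream ...
codec.flush();   // back to Flushed (still Executing)
codec.stop();    // back to Uninitialized; configure() may be called again
codec.release(); // Released: the instance must not be used any more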

MediaCodec API overview

MediaCodec operates on the actual stream data; the key methods are:

getInputBuffers: get the set of input buffers to be filled with data for encoding/decoding; returns an array of ByteBuffers
queueInputBuffer: submit a filled input buffer to the input queue
dequeueInputBuffer: take an empty input buffer from the queue so it can be filled
getOutputBuffers: get the set of output buffers that will hold the encoded/decoded data; returns an array of ByteBuffers
dequeueOutputBuffer: take a buffer of processed data from the output queue
releaseOutputBuffer: return the ByteBuffer to the codec once its data has been consumed
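Note that the buffer-array methods above (getInputBuffers/getOutputBuffers) are deprecated from API 21 onward; on newer API levels the per-index variants can be substituted in the code that follows:

ByteBuffer inputBuffer  = codec.getInputBuffer(inputBufIndex);   // instead of getInputBuffers()[inputBufIndex]
ByteBuffer outputBuffer = codec.getOutputBuffer(outputBufIndex); // instead of getOutputBuffers()[outputBufIndex]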

MediaCodec rate control

Rate control means controlling the amount of data flowing through a limited resource. The same idea appears in both TCP and video encoding:

For TCP it means controlling how much packet data is sent per unit of time; for an encoder it means controlling how much data is output per unit of time.

TCP's constraint is network bandwidth: rate control tries to use as much of it as possible without causing or worsening congestion. When bandwidth is plentiful and the network is healthy, packets are sent faster; when latency rises or packets are lost, sending slows down, because continuing to send at a high rate would only aggravate congestion and ultimately make delivery slower.

For video encoding the constraint was originally the decoder's capability: if the bit rate was too high, the stream could not be decoded. As codecs evolved, decoding capability stopped being the bottleneck and the constraint became transmission bandwidth or file size; the goal is the highest possible picture quality within a bounded amount of data.

Most encoders let you set a target bit rate, but the actual output never matches the setting exactly. What the encoder really controls during encoding is not the final bit rate but a quantization parameter (QP), which has no fixed relationship to the bit rate; the resulting rate depends on the image content.

Both TCP packets waiting to be sent and images waiting to be encoded can show "spikes", i.e. bursts of data over a short period. TCP can simply ignore a spike (especially when the network is already congested) without much harm, but if a video encoder ignores spikes, picture quality suffers badly: if a few frames carry much more information yet the bit rate must stay at the same level, more information has to be thrown away, so those frames become visibly more distorted.

Rate control with Android hardware encoders

MediaCodec exposes only a few rate-control-related interfaces: setting the target bit rate and the bitrate mode at configuration time, and adjusting the target bit rate dynamically (API level 19 and above).

Specify the target bit rate and the bitrate mode at configuration time:

mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, bitRate);
mediaFormat.setInteger(MediaFormat.KEY_BITRATE_MODE,
MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);
mVideoCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

There are three bitrate-control modes:

CQ  means the bit rate is not controlled at all; the encoder does its best to preserve image quality;
CBR means the encoder tries to hold the output bit rate at the configured value, i.e. the "ignore the spikes" behaviour mentioned above;
VBR means the encoder adjusts the output bit rate dynamically according to the complexity of the content (in practice, the amount of change between frames): complex frames get a higher bit rate, simple frames a lower one;
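Not every encoder supports all three modes, so it can be worth checking before setting KEY_BITRATE_MODE. A sketch of such a check, using "video/avc" as an assumed MIME type:

MediaCodecList codecList = new MediaCodecList(MediaCodecList.REGULAR_CODECS);
for (MediaCodecInfo info : codecList.getCodecInfos()) {
    if (!info.isEncoder()) continue;
    for (String type : info.getSupportedTypes()) {
        if (!type.equalsIgnoreCase("video/avc")) continue;
        MediaCodecInfo.EncoderCapabilities caps =
                info.getCapabilitiesForType(type).getEncoderCapabilities();
        Log.d("BitrateMode", info.getName()
                + " CQ="  + caps.isBitrateModeSupported(MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_CQ)
                + " VBR=" + caps.isBitrateModeSupported(MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR)
                + " CBR=" + caps.isBitrateModeSupported(MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_CBR));
    }
}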

Adjust the target bit rate dynamically:

Bundle param = new Bundle();
param.putInt(MediaCodec.PARAMETER_KEY_VIDEO_BITRATE, bitrate);
mediaCodec.setParameters(param);

CQ is the right choice when quality matters most, bandwidth is not a concern, and the decoder can tolerate large bit-rate fluctuations.

With VBR the output bit rate fluctuates within a range. Blocking artifacts improve for mild motion, but VBR still cannot cope with violent motion, and repeatedly lowering the target makes the bit rate drop sharply; if that is unacceptable, VBR is not a good choice.

CBR's advantage is that it is stable and predictable, which helps with real-time guarantees; this is why WebRTC development generally uses CBR.

Playing video with MediaCodec

The decoded video is displayed using a TextureView:

public class PlayerTextureView extends TextureView
	implements TextureView.SurfaceTextureListener, AspectRatioViewInterface {

	private double mRequestedAspect = -1.0;
	private Surface mSurface;

	public PlayerTextureView(Context context) {
		this(context, null, 0);
	}

	public PlayerTextureView(Context context, AttributeSet attrs) {
		this(context, attrs, 0);
	}

	public PlayerTextureView(Context context, AttributeSet attrs, int defStyle) {
		super(context, attrs, defStyle);
		setSurfaceTextureListener(this);
	}

	@Override
	public void onPause() {
	}

	@Override
	public void onResume() {
	}

	/**
	 * set aspect ratio of this view
	 * <code>aspect ratio = width / height</code>.
	 */
	public void setAspectRatio(double aspectRatio) {
		if (aspectRatio < 0) {
			throw new IllegalArgumentException();
		}
		if (mRequestedAspect != aspectRatio) {
			mRequestedAspect = aspectRatio;
			requestLayout();
		}
	}

	/**
	 * measure view size with keeping aspect ratio
	 */
	@Override
	protected void onMeasure(int widthMeasureSpec, int heightMeasureSpec) {

		if (mRequestedAspect > 0) {
			int initialWidth = MeasureSpec.getSize(widthMeasureSpec);
			int initialHeight = MeasureSpec.getSize(heightMeasureSpec);

			final int horizPadding = getPaddingLeft() + getPaddingRight();
			final int vertPadding = getPaddingTop() + getPaddingBottom();
			initialWidth -= horizPadding;
			initialHeight -= vertPadding;

			final double viewAspectRatio = (double)initialWidth / initialHeight;
			final double aspectDiff = mRequestedAspect / viewAspectRatio - 1;

			// keep the measured size if the calculated aspect ratio is close enough to the requested one
			if (Math.abs(aspectDiff) > 0.01) {
				if (aspectDiff > 0) {
					// adjust height from width
					initialHeight = (int) (initialWidth / mRequestedAspect);
				} else {
					// adjust width from height
					initialWidth = (int) (initialHeight * mRequestedAspect);
				}
				initialWidth += horizPadding;
				initialHeight += vertPadding;
				widthMeasureSpec = MeasureSpec.makeMeasureSpec(initialWidth, MeasureSpec.EXACTLY);
				heightMeasureSpec = MeasureSpec.makeMeasureSpec(initialHeight, MeasureSpec.EXACTLY);
			}
		}

		super.onMeasure(widthMeasureSpec, heightMeasureSpec);
	}

	@Override
	public void onSurfaceTextureAvailable(SurfaceTexture surface, int width, int height) {
		if (mSurface != null)
			mSurface.release();
		mSurface = new Surface(surface);
	}

	@Override
	public void onSurfaceTextureSizeChanged(SurfaceTexture surface, int width, int height) {
	}

	@Override
	public boolean onSurfaceTextureDestroyed(SurfaceTexture surface) {
		if (mSurface != null) {
			mSurface.release();
			mSurface = null;
		}
		return true;
	}

	@Override
	public void onSurfaceTextureUpdated(SurfaceTexture surface) {
	}

	public Surface getSurface() {
		return mSurface;
	}
}

The custom PlayerTextureView extends TextureView and implements the TextureView.SurfaceTextureListener interface:

public interface SurfaceTextureListener {
    void onSurfaceTextureAvailable(SurfaceTexture var1, int var2, int var3);

    void onSurfaceTextureSizeChanged(SurfaceTexture var1, int var2, int var3);

    boolean onSurfaceTextureDestroyed(SurfaceTexture var1);

    void onSurfaceTextureUpdated(SurfaceTexture var1);
}

In the onSurfaceTextureAvailable callback we receive the SurfaceTexture and wrap it in a Surface, which will consume the decoder's video output:

@Override
public void onSurfaceTextureAvailable(SurfaceTexture surface, int width, int height) {
	if (mSurface != null)
		mSurface.release();
	mSurface = new Surface(surface);
}

The Surface is released in the onSurfaceTextureDestroyed callback:

@Override
public boolean onSurfaceTextureDestroyed(SurfaceTexture surface) {
	if (mSurface != null) {
		mSurface.release();
		mSurface = null;
	}
	return true;
}

AspectRatioViewInterface is the other implemented interface; it controls the aspect ratio and the view's display state:

public interface AspectRatioViewInterface {
    public void setAspectRatio(double aspectRatio);
    public void onPause();
    public void onResume();
}

The file being played is an MP4. Before decoding, MediaMetadataRetriever provides a unified interface for retrieving frames and metadata from an input media file:

// open the file
final File src = new File(sourceFile);
// parse the media file
mMetadata = new MediaMetadataRetriever();
mMetadata.setDataSource(sourceFile);

MediaMetadataRetriever is used to read the parameters we care about from the media file: video width, height, rotation, bit rate, duration, and frame rate:

protected void updateMovieInfo() {
	mVideoWidth = mVideoHeight = mRotation = mBitrate = 0;
	mDuration = 0;
	mFrameRate = 0;
	String value = mMetadata.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_WIDTH);
	if (!TextUtils.isEmpty(value)) {
		mVideoWidth = Integer.parseInt(value);
	}
	value = mMetadata.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_HEIGHT);
	if (!TextUtils.isEmpty(value)) {
		mVideoHeight = Integer.parseInt(value);
	}
	value = mMetadata.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION);
	if (!TextUtils.isEmpty(value)) {
		mRotation = Integer.parseInt(value);
	}
	value = mMetadata.extractMetadata(MediaMetadataRetriever.METADATA_KEY_BITRATE);
	if (!TextUtils.isEmpty(value)) {
		mBitrate = Integer.parseInt(value);
	}
	value = mMetadata.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION);
	if (!TextUtils.isEmpty(value)) {
		mDuration = Long.parseLong(value) * 1000;
	}
}

Now the video itself can be decoded. MediaExtractor facilitates the extraction of demuxed, typically encoded, media data from a data source:

// set the data source to be decoded
mVideoMediaExtractor = new MediaExtractor();
mVideoMediaExtractor.setDataSource(sourceFile);

A media file contains both audio and video tracks, and here we only need the video, so the tracks have to be distinguished:

int trackIndex = selectTrack(mVideoMediaExtractor, "video/");
protected static final int selectTrack(final MediaExtractor extractor, final String mimeType) {
    final int numTracks = extractor.getTrackCount();
    MediaFormat format;
    String mime;
    for (int i = 0; i < numTracks; i++) {
        format = extractor.getTrackFormat(i);
        mime = format.getString(MediaFormat.KEY_MIME);
        if (mime.startsWith(mimeType)) {
            if (DEBUG) {
                Log.d(TAG_STATIC, "Extractor selected track " + i + " (" + mime + "): " + format);
            }
            return i;
        }
    }
    return -1;
}

If the returned track index is valid, we can use it to query the track's format from the MediaExtractor:

if (trackIndex >= 0) {
	mVideoMediaExtractor.selectTrack(trackIndex);
    final MediaFormat format = mVideoMediaExtractor.getTrackFormat(trackIndex);
	mVideoWidth = format.getInteger(MediaFormat.KEY_WIDTH);
	mVideoHeight = format.getInteger(MediaFormat.KEY_HEIGHT);
	mDuration = format.getLong(MediaFormat.KEY_DURATION);

	if (DEBUG) Log.v(TAG, String.format("format:size(%d,%d),duration=%d,bps=%d,framerate=%f,rotation=%d",
		mVideoWidth, mVideoHeight, mDuration, mBitrate, mFrameRate, mRotation));
}
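The mFrameRate logged above is never actually populated in these snippets. If it is needed, it can usually be read from the same track format; a small, assumed addition (the key is optional in many containers, so the read is guarded):

// KEY_FRAME_RATE is optional (and occasionally stored as a float), so guard the read
if (format.containsKey(MediaFormat.KEY_FRAME_RATE)) {
    mFrameRate = format.getInteger(MediaFormat.KEY_FRAME_RATE);
}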

All of the above is metadata and format parsing; once it is done, playback can start.

MediaExtractor supports seeking: pass the desired timestamp to seekTo to jump to that position and decode from there.

mVideoMediaExtractor.seekTo(newTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC);
mVideoMediaExtractor.advance();

The actual decoding is done with MediaCodec, the class introduced at the start of this article: it provides access to the low-level encoder/decoder components and is typically used together with MediaExtractor, MediaSync, MediaMuxer, MediaCrypto, MediaDrm, Image, Surface, and AudioTrack.

Creating a MediaCodec requires the media type. Using the MediaExtractor created earlier and the selected track index, we obtain the track's MediaFormat, read its MIME type, and pass that to MediaCodec to create the decoder:

final MediaFormat format = mediaExtractor.getTrackFormat(trackIndex);
final String mime = format.getString(MediaFormat.KEY_MIME);
try {
	codec = MediaCodec.createDecoderByType(mime);
	codec.configure(format, mOutputSurface, null, 0);
    codec.start();
} catch (final IOException e) {
	codec = null;
	Log.w(TAG, e);
}

configure configures the component. Its parameters are:

MediaFormat: the format of the input data (for a decoder) or the desired format of the output data (for an encoder). Passing null as the format is equivalent to passing an empty MediaFormat; the value may be null.
Surface: the surface on which the decoder's output will be rendered. Pass null if the codec does not produce raw video output (i.e. it is not a video decoder) and/or if you want to configure the codec for ByteBuffer output; the value may be null.
MediaCrypto: a crypto object enabling secure decryption of the media data. Pass null for non-secure codecs; the value may be null.
int: pass CONFIGURE_FLAG_ENCODE to configure the component as an encoder. The value is 0 or CONFIGURE_FLAG_ENCODE.

public void configure (MediaFormat format, 
                Surface surface, 
                MediaCrypto crypto, 
                int flags)

After the component has been configured successfully, call start(). start() is also called to resume requesting input buffers when the codec has been configured in asynchronous mode and has just been flushed.
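The code in this article uses the synchronous API, but for reference, a minimal sketch of the asynchronous mode mentioned above (the callback must be registered before configure(), the callbacks arrive on a handler thread, and codec, format and surface are assumed to exist):

codec.setCallback(new MediaCodec.Callback() {
    @Override
    public void onInputBufferAvailable(MediaCodec mc, int index) {
        ByteBuffer input = mc.getInputBuffer(index);
        // fill "input" with one sample, then queue it:
        // mc.queueInputBuffer(index, 0, size, presentationTimeUs, 0);
    }
    @Override
    public void onOutputBufferAvailable(MediaCodec mc, int index, MediaCodec.BufferInfo info) {
        // consume the output, then release it (true = render to the configured Surface)
        mc.releaseOutputBuffer(index, true);
    }
    @Override
    public void onOutputFormatChanged(MediaCodec mc, MediaFormat newFormat) { }
    @Override
    public void onError(MediaCodec mc, MediaCodec.CodecException e) { }
});
codec.configure(format, surface, null, 0);
codec.start();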

Once the MediaCodec is configured and started, obtain its BufferInfo and its input and output buffers, and decoding can begin:

Thread videoThread = null;
if (mVideoTrackIndex >= 0) {
	final MediaCodec codec = internalStartVideo(mVideoMediaExtractor, mVideoTrackIndex);
	if (codec != null) {
		mVideoMediaCodec = codec;
        mVideoBufferInfo = new MediaCodec.BufferInfo();
        mVideoInputBuffers = codec.getInputBuffers();
        mVideoOutputBuffers = codec.getOutputBuffers();
	}
	mVideoInputDone = mVideoOutputDone = false;
	videoThread = new Thread(mVideoTask, "VideoTask");
}
if (videoThread != null) videoThread.start();

Decoding has no fixed end marker of its own, so the flow is driven by the player's state, and the decode task therefore runs in a loop:

private final Runnable mVideoTask = new Runnable() {
	@Override
	public void run() {
		if (DEBUG) Log.v(TAG, "VideoTask:start");
		for (; mIsRunning && !mVideoInputDone && !mVideoOutputDone ;) {
			try {
		        if (!mVideoInputDone) {
		        	handleInputVideo();
		        }
		        if (!mVideoOutputDone) {
					handleOutputVideo(mCallback);
		        }
			} catch (final Exception e) {
				Log.e(TAG, "VideoTask:", e);
				break;
			}
		} // end of for
		if (DEBUG) Log.v(TAG, "VideoTask:finished");
		synchronized (mSync) {
			mVideoInputDone = mVideoOutputDone = true;
			mSync.notifyAll();
		}
	}
};

The samples read from the extractor have to be queued into the codec's input buffers:

private final void handleInputVideo() {
    long presentationTimeUs = mVideoMediaExtractor.getSampleTime();
    if (presentationTimeUs < previousVideoPresentationTimeUs) {
        presentationTimeUs += previousVideoPresentationTimeUs - presentationTimeUs; //  + EPS;
    }
    previousVideoPresentationTimeUs = presentationTimeUs;
    final boolean b = internalProcessInput(mVideoMediaCodec, mVideoMediaExtractor, mVideoInputBuffers,
            presentationTimeUs, false);
    if (!b) {
        if (DEBUG) Log.i(TAG, "video track input reached EOS");
        while (mIsRunning) {
            final int inputBufIndex = mVideoMediaCodec.dequeueInputBuffer(TIMEOUT_USEC);
            if (inputBufIndex >= 0) {
                mVideoMediaCodec.queueInputBuffer(inputBufIndex, 0, 0, 0L,
                        MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                if (DEBUG) Log.v(TAG, "sent input EOS:" + mVideoMediaCodec);
                break;
            }
        }
        synchronized (mSync) {
            mVideoInputDone = true;
            mSync.notifyAll();
        }
    }
}
protected boolean internalProcessInput(final MediaCodec codec,
                                           final MediaExtractor extractor,
                                           final ByteBuffer[] inputBuffers,
                                           final long presentationTimeUs, final boolean isAudio) {

//		if (DEBUG) Log.v(TAG, "internalProcessInput:presentationTimeUs=" + presentationTimeUs);
    boolean result = true;
    while (mIsRunning) {
        final int inputBufIndex = codec.dequeueInputBuffer(TIMEOUT_USEC);
        if (inputBufIndex == MediaCodec.INFO_TRY_AGAIN_LATER)
            break;
        if (inputBufIndex >= 0) {
            final int size = extractor.readSampleData(inputBuffers[inputBufIndex], 0);
            if (size > 0) {
                codec.queueInputBuffer(inputBufIndex, 0, size, presentationTimeUs, 0);
            }
            result = extractor.advance();    // return false if no data is available
            break;
        }
    }
    return result;
}

Once input has been queued, the decoded data has to be drained from the output queue:

private final void handleOutputVideo(final IFrameCallback frameCallback) {
//    	if (DEBUG) Log.v(TAG, "handleDrainVideo:");
    while (mIsRunning && !mVideoOutputDone) {
        final int decoderStatus = mVideoMediaCodec.dequeueOutputBuffer(mVideoBufferInfo, TIMEOUT_USEC);
        if (decoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
            return;
        } else if (decoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            mVideoOutputBuffers = mVideoMediaCodec.getOutputBuffers();
            if (DEBUG) Log.d(TAG, "INFO_OUTPUT_BUFFERS_CHANGED:");
        } else if (decoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            final MediaFormat newFormat = mVideoMediaCodec.getOutputFormat();
            if (DEBUG) Log.d(TAG, "video decoder output format changed: " + newFormat);
        } else if (decoderStatus < 0) {
            throw new RuntimeException(
                    "unexpected result from video decoder.dequeueOutputBuffer: " + decoderStatus);
        } else { // decoderStatus >= 0
            boolean doRender = false;
            if (mVideoBufferInfo.size > 0) {
                doRender = (mVideoBufferInfo.size != 0)
                        && !internalWriteVideo(mVideoOutputBuffers[decoderStatus],
                        0, mVideoBufferInfo.size, mVideoBufferInfo.presentationTimeUs);
                if (doRender) {
                    if (!frameCallback.onFrameAvailable(mVideoBufferInfo.presentationTimeUs))
                        mVideoStartTime = adjustPresentationTime(mVideoStartTime, mVideoBufferInfo.presentationTimeUs);
                }
            }
            mVideoMediaCodec.releaseOutputBuffer(decoderStatus, doRender);
            if ((mVideoBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                if (DEBUG) Log.d(TAG, "video:output EOS");
                synchronized (mSync) {
                    mVideoOutputDone = true;
                    mSync.notifyAll();
                }
            }
        }
    }
}

Here we do not need to process the decoded data ourselves, so internalWriteVideo is an empty implementation. The output then has to be paced according to the frame timestamps:

protected long adjustPresentationTime(
        final long startTime, final long presentationTimeUs) {

    if (startTime > 0) {
        for (long t = presentationTimeUs - (System.nanoTime() / 1000 - startTime);
             t > 0; t = presentationTimeUs - (System.nanoTime() / 1000 - startTime)) {
            synchronized (mVideoSync) {
                try {
                    mVideoSync.wait(t / 1000, (int) ((t % 1000) * 1000));
                } catch (final InterruptedException e) {
                    // ignore
                }
                if ((mState == REQ_STOP) || (mState == REQ_QUIT))
                    break;
            }
        }
        return startTime;
    } else {
        return System.nanoTime() / 1000;
    }
}

When playback stops, the decoder and the related resources must be stopped and released:

protected MediaExtractor mVideoMediaExtractor;
private MediaCodec mVideoMediaCodec;
private MediaCodec.BufferInfo mVideoBufferInfo;
private ByteBuffer[] mVideoInputBuffers;
private ByteBuffer[] mVideoOutputBuffers;
protected MediaMetadataRetriever mMetadata;


private final void handleStop() {
    if (DEBUG) Log.v(TAG, "handleStop:");
    synchronized (mVideoTask) {
        internalStopVideo();
        mVideoTrackIndex = -1;
    }
    if (mVideoMediaCodec != null) {
        mVideoMediaCodec.stop();
        mVideoMediaCodec.release();
        mVideoMediaCodec = null;
    }
    if (mVideoMediaExtractor != null) {
        mVideoMediaExtractor.release();
        mVideoMediaExtractor = null;
    }
    mVideoBufferInfo = null;
    mVideoInputBuffers = mVideoOutputBuffers = null;
    if (mMetadata != null) {
        mMetadata.release();
        mMetadata = null;
    }
    synchronized (mSync) {
        mVideoOutputDone = mVideoInputDone = true;
        mState = STATE_STOP;
    }
    mCallback.onFinished();
}

Playing audio and video with MediaCodec

To add audio on top of the video playback, the video path stays unchanged and audio handling is added alongside it:

mAudioTrackIndex = internalPrepareAudio(source_file);
protected int internalPrepareAudio(final String sourceFile) {
	int trackIndex = -1;
	mAudioMediaExtractor = new MediaExtractor();
	try {
		mAudioMediaExtractor.setDataSource(sourceFile);
		trackIndex = selectTrack(mAudioMediaExtractor, "audio/");
		if (trackIndex >= 0) {
			mAudioMediaExtractor.selectTrack(trackIndex);
	        final MediaFormat format = mAudioMediaExtractor.getTrackFormat(trackIndex);
	        mAudioChannels = format.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
	        mAudioSampleRate = format.getInteger(MediaFormat.KEY_SAMPLE_RATE);
	        final int min_buf_size = AudioTrack.getMinBufferSize(mAudioSampleRate,
	        	(mAudioChannels == 1 ? AudioFormat.CHANNEL_OUT_MONO : AudioFormat.CHANNEL_OUT_STEREO),
	        	AudioFormat.ENCODING_PCM_16BIT);
	        final int max_input_size = format.getInteger(MediaFormat.KEY_MAX_INPUT_SIZE);
	        mAudioInputBufSize =  min_buf_size > 0 ? min_buf_size * 4 : max_input_size;
	        if (mAudioInputBufSize > max_input_size) mAudioInputBufSize = max_input_size;
	        final int frameSizeInBytes = mAudioChannels * 2;
	        mAudioInputBufSize = (mAudioInputBufSize / frameSizeInBytes) * frameSizeInBytes;
	        if (DEBUG) Log.v(TAG, String.format("getMinBufferSize=%d,max_input_size=%d,mAudioInputBufSize=%d",min_buf_size, max_input_size, mAudioInputBufSize));
	        //
	        mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
	        	mAudioSampleRate,
	        	(mAudioChannels == 1 ? AudioFormat.CHANNEL_OUT_MONO : AudioFormat.CHANNEL_OUT_STEREO),
	        	AudioFormat.ENCODING_PCM_16BIT,
	        	mAudioInputBufSize,
	        	AudioTrack.MODE_STREAM);
	        try {
	        	mAudioTrack.play();
	        } catch (final Exception e) {
	        	Log.e(TAG, "failed to start audio track playing", e);
	    		mAudioTrack.release();
	        	mAudioTrack = null;
	        }
		}
	} catch (final IOException e) {
		Log.w(TAG, e);
	}
	return trackIndex;
}

Audio, like video, supports seeking to a timestamp before decoding:

mAudioMediaExtractor.seekTo(newTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC);
mAudioMediaExtractor.advance();

Next comes decoding of the audio track:

mAudioInputDone = mAudioOutputDone = true;
if (mAudioTrackIndex >= 0) {
	final MediaCodec codec = internalStartAudio(mAudioMediaExtractor, mAudioTrackIndex);
	if (codec != null) {
        mAudioMediaCodec = codec;
        mAudioBufferInfo = new MediaCodec.BufferInfo();
        mAudioInputBuffers = codec.getInputBuffers();
        mAudioOutputBuffers = codec.getOutputBuffers();
	}
	mAudioInputDone = mAudioOutputDone = false;
    audioThread = new Thread(mAudioTask, "AudioTask");
}
if (audioThread != null) audioThread.start();
protected MediaCodec internalStartAudio(final MediaExtractor media_extractor, final int trackIndex) {
	if (DEBUG) Log.v(TAG, "internalStartAudio:");
	MediaCodec codec = null;
	if (trackIndex >= 0) {
        final MediaFormat format = media_extractor.getTrackFormat(trackIndex);
        final String mime = format.getString(MediaFormat.KEY_MIME);
		try {
			codec = MediaCodec.createDecoderByType(mime);
			codec.configure(format, null, null, 0);
	        codec.start();
	    	if (DEBUG) Log.v(TAG, "internalStartAudio:codec started");
	    	//
	        final ByteBuffer[] buffers = codec.getOutputBuffers();
	        int sz = buffers[0].capacity();
	        if (sz <= 0)
	        	sz = mAudioInputBufSize;
	        if (DEBUG) Log.v(TAG, "AudioOutputBufSize:" + sz);
	        mAudioOutTempBuf = new byte[sz];
		} catch (final IOException e) {
			Log.w(TAG, e);
			codec = null;
		}
	}
	return codec;
}

The audio decode task also runs on its own thread:

private final Runnable mAudioTask = new Runnable() {
	@Override
	public void run() {
		if (DEBUG) Log.v(TAG, "AudioTask:start");
		for (; mIsRunning && !mAudioInputDone && !mAudioOutputDone ;) {
			try {
		        if (!mAudioInputDone) {
		        	handleInputAudio();
		        }
				if (!mAudioOutputDone) {
					handleOutputAudio(mCallback);
				}
			} catch (final Exception e) {
				Log.e(TAG, "VideoTask:", e);
				break;
			}
		} // end of for
		if (DEBUG) Log.v(TAG, "AudioTask:finished");
		synchronized (mAudioTask) {
			mAudioInputDone = mAudioOutputDone = true;
			mAudioTask.notifyAll();
		}
	}
};
private final void handleInputAudio() {
	final long presentationTimeUs = mAudioMediaExtractor.getSampleTime();
/*		if (presentationTimeUs < previousAudioPresentationTimeUs) {
		presentationTimeUs += previousAudioPresentationTimeUs - presentationTimeUs; //  + EPS;
	}
	previousAudioPresentationTimeUs = presentationTimeUs; */
    final boolean b = internal_process_input(mAudioMediaCodec, mAudioMediaExtractor, mAudioInputBuffers,
    		presentationTimeUs, true);
    if (!b) {
    	if (DEBUG) Log.i(TAG, "audio track input reached EOS");
		while (mIsRunning) {
            final int inputBufIndex = mAudioMediaCodec.dequeueInputBuffer(TIMEOUT_USEC);
            if (inputBufIndex >= 0) {
            	mAudioMediaCodec.queueInputBuffer(inputBufIndex, 0, 0, 0L,
            		MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            	if (DEBUG) Log.v(TAG, "sent input EOS:" + mAudioMediaCodec);
            	break;
            }
    	}
		synchronized (mAudioTask) {
			mAudioInputDone = true;
			mAudioTask.notifyAll();
		}
   }
}
protected boolean internal_process_input(final MediaCodec codec, final MediaExtractor extractor, final ByteBuffer[] inputBuffers, final long presentationTimeUs, final boolean isAudio) {
//		if (DEBUG) Log.v(TAG, "internalProcessInput:presentationTimeUs=" + presentationTimeUs);
	boolean result = true;
	while (mIsRunning) {
        final int inputBufIndex = codec.dequeueInputBuffer(TIMEOUT_USEC);
        if (inputBufIndex == MediaCodec.INFO_TRY_AGAIN_LATER)
        	break;
        if (inputBufIndex >= 0) {
            final int size = extractor.readSampleData(inputBuffers[inputBufIndex], 0);
            if (size > 0) {
            	codec.queueInputBuffer(inputBufIndex, 0, size, presentationTimeUs, 0);
            }
        	result = extractor.advance();	// return false if no data is available
            break;
        }
	}
	return result;
}
private final void handleOutputAudio(final IFrameCallback frameCallback) {
//		if (DEBUG) Log.v(TAG, "handleDrainAudio:");
	while (mIsRunning && !mAudioOutputDone) {
		final int decoderStatus = mAudioMediaCodec.dequeueOutputBuffer(mAudioBufferInfo, TIMEOUT_USEC);
		if (decoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
			return;
		} else if (decoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
			mAudioOutputBuffers = mAudioMediaCodec.getOutputBuffers();
			if (DEBUG) Log.d(TAG, "INFO_OUTPUT_BUFFERS_CHANGED:");
		} else if (decoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
			final MediaFormat newFormat = mAudioMediaCodec.getOutputFormat();
			if (DEBUG) Log.d(TAG, "audio decoder output format changed: " + newFormat);
		} else if (decoderStatus < 0) {
			throw new RuntimeException(
				"unexpected result from audio decoder.dequeueOutputBuffer: " + decoderStatus);
		} else { // decoderStatus >= 0
			if (mAudioBufferInfo.size > 0) {
				internalWriteAudio(mAudioOutputBuffers[decoderStatus],
					0, mAudioBufferInfo.size, mAudioBufferInfo.presentationTimeUs);
				if (!frameCallback.onFrameAvailable(mAudioBufferInfo.presentationTimeUs))
					mAudioStartTime = adjustPresentationTime(mAudioSync, mAudioStartTime, mAudioBufferInfo.presentationTimeUs);
			}
			mAudioMediaCodec.releaseOutputBuffer(decoderStatus, false);
			if ((mAudioBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
				if (DEBUG) Log.d(TAG, "audio:output EOS");
				synchronized (mAudioTask) {
					mAudioOutputDone = true;
					mAudioTask.notifyAll();
				}
			}
		}
	}
}
protected boolean internalWriteAudio(final ByteBuffer buffer, final int offset, final int size, final long presentationTimeUs) {
//		if (DEBUG) Log.d(TAG, "internalWriteAudio");
    if (mAudioOutTempBuf.length < size) {
    	mAudioOutTempBuf = new byte[size];
    }
    buffer.position(offset);
    buffer.get(mAudioOutTempBuf, 0, size);
    buffer.clear();
    if (mAudioTrack != null)
    	mAudioTrack.write(mAudioOutTempBuf, 0, size);
    return true;
}
protected long adjustPresentationTime(final Object sync, final long startTime, final long presentationTimeUs) {
	if (startTime > 0) {
		for (long t = presentationTimeUs - (System.nanoTime() / 1000 - startTime);
				t > 0; t = presentationTimeUs - (System.nanoTime() / 1000 - startTime)) {
			synchronized (sync) {
				try {
					sync.wait(t / 1000, (int)((t % 1000) * 1000));
				} catch (final InterruptedException e) {
					// ignore
				}
				if ((mState == REQ_STOP) || (mState == REQ_QUIT))
					break;
			}
		}
		return startTime;
	} else {
		return System.nanoTime() / 1000;
	}
}

Extracting frame thumbnails from an MP4 file

Leaving the UI aside, three main pieces of functionality are involved:

  1. getting the video duration,
  2. extracting frames from the video and turning them into a series of bitmaps,
  3. cropping

Approach

  1. From the video duration and the desired interval between captured frames, work out in advance the total number of frames to capture and the timestamp (in milliseconds) of each one.
  2. While reading frames with MediaExtractor, check each frame's timestamp to decide whether it is one of the frames to capture; if so, record the frame's sequence number for the next step.
  3. While decoding with MediaCodec, use the recorded sequence numbers to decide whether the current frame should be converted into a bitmap. The reference implementation below (adapted from Android's ExtractMpegFramesTest) shows the complete extract-and-save flow.

/**
 * Extract frames from an MP4 using MediaExtractor, MediaCodec, and GLES.  Put a .mp4 file
 * in "/sdcard/source.mp4" and look for output files named "/sdcard/frame-XX.png".
 * <p>
 * This uses various features first available in Android "Jellybean" 4.1 (API 16).
 * <p>
 * (This was derived from bits and pieces of CTS tests, and is packaged as such, but is not
 * currently part of CTS.)
 */
public class ExtractMpegFramesTest {
    private static final String TAG = "ExtractMpegFramesTest";
    private static final boolean VERBOSE = true;           // lots of logging

    // where to find files (note: requires WRITE_EXTERNAL_STORAGE permission)
    private static final File FILES_DIR = Environment.getExternalStorageDirectory();
    private static final String INPUT_FILE = "Movies/demo_video.mp4";
    private static final int MAX_FRAMES = 10;       // stop extracting after this many

    /** test entry point */
    public void testExtractMpegFrames() {
        try {
            ExtractMpegFramesWrapper.runTest(this);
        } catch (Throwable throwable) {
            throwable.printStackTrace();
        }
    }

    /**
     * Wraps extractMpegFrames().  This is necessary because SurfaceTexture will try to use
     * the looper in the current thread if one exists, and the CTS tests create one on the
     * test thread.
     *
     * The wrapper propagates exceptions thrown by the worker thread back to the caller.
     */
    private static class ExtractMpegFramesWrapper implements Runnable {
        private Throwable mThrowable;
        private ExtractMpegFramesTest mTest;

        private ExtractMpegFramesWrapper(ExtractMpegFramesTest test) {
            mTest = test;
        }

        @Override
        public void run() {
            try {
                mTest.extractMpegFrames();
            } catch (Throwable th) {
                mThrowable = th;
            }
        }

        /** Entry point. */
        public static void runTest(ExtractMpegFramesTest obj) throws Throwable {
            ExtractMpegFramesWrapper wrapper = new ExtractMpegFramesWrapper(obj);
            Thread th = new Thread(wrapper, "codec test");
            th.start();
            th.join();
            if (wrapper.mThrowable != null) {
                throw wrapper.mThrowable;
            }
        }
    }

    /**
     * Tests extraction from an MP4 to a series of PNG files.
     * <p>
     * We scale the video to 640x480 for the PNG just to demonstrate that we can scale the
     * video with the GPU.  If the input video has a different aspect ratio, we could preserve
     * it by adjusting the GL viewport to get letterboxing or pillarboxing, but generally if
     * you're extracting frames you don't want black bars.
     */
    private void extractMpegFrames() throws IOException {
        MediaCodec decoder = null;
        CodecOutputSurface outputSurface = null;
        MediaExtractor extractor = null;
        int saveWidth = 640;
        int saveHeight = 480;

        try {
            File inputFile = new File(FILES_DIR, INPUT_FILE);   // must be an absolute path
            // The MediaExtractor error messages aren't very useful.  Check to see if the input
            // file exists so we can throw a better one if it's not there.
            if (!inputFile.canRead()) {
                throw new FileNotFoundException("Unable to read " + inputFile);
            }

            extractor = new MediaExtractor();
            extractor.setDataSource(inputFile.toString());
            int trackIndex = selectTrack(extractor);
            if (trackIndex < 0) {
                throw new RuntimeException("No video track found in " + inputFile);
            }
            extractor.selectTrack(trackIndex);

            MediaFormat format = extractor.getTrackFormat(trackIndex);
            if (VERBOSE) {
                Log.d(TAG, "Video size is " + format.getInteger(MediaFormat.KEY_WIDTH) + "x" +
                        format.getInteger(MediaFormat.KEY_HEIGHT));
            }

            // Could use width/height from the MediaFormat to get full-size frames.
            outputSurface = new CodecOutputSurface(saveWidth, saveHeight);

            // Create a MediaCodec decoder, and configure it with the MediaFormat from the
            // extractor.  It's very important to use the format from the extractor because
            // it contains a copy of the CSD-0/CSD-1 codec-specific data chunks.
            String mime = format.getString(MediaFormat.KEY_MIME);
            decoder = MediaCodec.createDecoderByType(mime);
            decoder.configure(format, outputSurface.getSurface(), null, 0);
            decoder.start();

            doExtract(extractor, trackIndex, decoder, outputSurface);
        } finally {
            // release everything we grabbed
            if (outputSurface != null) {
                outputSurface.release();
                outputSurface = null;
            }
            if (decoder != null) {
                decoder.stop();
                decoder.release();
                decoder = null;
            }
            if (extractor != null) {
                extractor.release();
                extractor = null;
            }
        }
    }

    /**
     * Selects the video track, if any.
     *
     * @return the track index, or -1 if no video track is found.
     */
    private int selectTrack(MediaExtractor extractor) {
        // Select the first video track we find, ignore the rest.
        int numTracks = extractor.getTrackCount();
        for (int i = 0; i < numTracks; i++) {
            MediaFormat format = extractor.getTrackFormat(i);
            String mime = format.getString(MediaFormat.KEY_MIME);
            if (mime.startsWith("video/")) {
                if (VERBOSE) {
                    Log.d(TAG, "Extractor selected track " + i + " (" + mime + "): " + format);
                }
                return i;
            }
        }

        return -1;
    }

    /**
     * Work loop.
     */
    static void doExtract(MediaExtractor extractor, int trackIndex, MediaCodec decoder,
                          CodecOutputSurface outputSurface) throws IOException {
        final int TIMEOUT_USEC = 10000;
        ByteBuffer[] decoderInputBuffers = decoder.getInputBuffers();
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int inputChunk = 0;
        int decodeCount = 0;
        long frameSaveTime = 0;

        boolean outputDone = false;
        boolean inputDone = false;
        while (!outputDone) {
            if (VERBOSE) Log.d(TAG, "loop");

            // Feed more data to the decoder.
            if (!inputDone) {
                int inputBufIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
                if (inputBufIndex >= 0) {
                    ByteBuffer inputBuf = decoderInputBuffers[inputBufIndex];
                    // Read the sample data into the ByteBuffer.  This neither respects nor
                    // updates inputBuf's position, limit, etc.
                    int chunkSize = extractor.readSampleData(inputBuf, 0);
                    if (chunkSize < 0) {
                        // End of stream -- send empty frame with EOS flag set.
                        decoder.queueInputBuffer(inputBufIndex, 0, 0, 0L,
                                MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                        inputDone = true;
                        if (VERBOSE) Log.d(TAG, "sent input EOS");
                    } else {
                        if (extractor.getSampleTrackIndex() != trackIndex) {
                            Log.w(TAG, "WEIRD: got sample from track " +
                                    extractor.getSampleTrackIndex() + ", expected " + trackIndex);
                        }
                        long presentationTimeUs = extractor.getSampleTime();
                        decoder.queueInputBuffer(inputBufIndex, 0, chunkSize,
                                presentationTimeUs, 0 /*flags*/);
                        if (VERBOSE) {
                            Log.d(TAG, "submitted frame " + inputChunk + " to dec, size=" +
                                    chunkSize);
                        }
                        inputChunk++;
                        extractor.advance();
                    }
                } else {
                    if (VERBOSE) Log.d(TAG, "input buffer not available");
                }
            }

            if (!outputDone) {
                int decoderStatus = decoder.dequeueOutputBuffer(info, TIMEOUT_USEC);
                if (decoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
                    // no output available yet
                    if (VERBOSE) Log.d(TAG, "no output from decoder available");
                } else if (decoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                    // not important for us, since we're using Surface
                    if (VERBOSE) Log.d(TAG, "decoder output buffers changed");
                } else if (decoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                    MediaFormat newFormat = decoder.getOutputFormat();
                    if (VERBOSE) Log.d(TAG, "decoder output format changed: " + newFormat);
                } else if (decoderStatus < 0) {
                    Log.d(TAG , "unexpected result from decoder.dequeueOutputBuffer: " + decoderStatus);
                } else { // decoderStatus >= 0
                    if (VERBOSE) Log.d(TAG, "surface decoder given buffer " + decoderStatus +
                            " (size=" + info.size + ")");
                    if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                        if (VERBOSE) Log.d(TAG, "output EOS");
                        outputDone = true;
                    }

                    boolean doRender = (info.size != 0);

                    // As soon as we call releaseOutputBuffer, the buffer will be forwarded
                    // to SurfaceTexture to convert to a texture.  The API doesn't guarantee
                    // that the texture will be available before the call returns, so we
                    // need to wait for the onFrameAvailable callback to fire.
                    decoder.releaseOutputBuffer(decoderStatus, doRender);
                    if (doRender) {
                        if (VERBOSE) Log.d(TAG, "awaiting decode of frame " + decodeCount);
                        outputSurface.awaitNewImage();
                        outputSurface.drawImage(true);

                        if (decodeCount < MAX_FRAMES) {
                            File outputFile = new File(FILES_DIR,
                                    String.format("frame-%02d.png", decodeCount));
                            long startWhen = System.nanoTime();
                            outputSurface.saveFrame(outputFile.toString());
                            frameSaveTime += System.nanoTime() - startWhen;
                        }
                        decodeCount++;
                    }
                }
            }
        }

        int numSaved = (MAX_FRAMES < decodeCount) ? MAX_FRAMES : decodeCount;
        Log.d(TAG, "Saving " + numSaved + " frames took " +
                (frameSaveTime / numSaved / 1000) + " us per frame");
    }


    /**
     * Holds state associated with a Surface used for MediaCodec decoder output.
     * <p>
     * The constructor for this class will prepare GL, create a SurfaceTexture,
     * and then create a Surface for that SurfaceTexture.  The Surface can be passed to
     * MediaCodec.configure() to receive decoder output.  When a frame arrives, we latch the
     * texture with updateTexImage(), then render the texture with GL to a pbuffer.
     * <p>
     * By default, the Surface will be using a BufferQueue in asynchronous mode, so we
     * can potentially drop frames.
     */
    private static class CodecOutputSurface
            implements SurfaceTexture.OnFrameAvailableListener {
        private ExtractMpegFramesTest.STextureRender mTextureRender;
        private SurfaceTexture mSurfaceTexture;
        private Surface mSurface;

        private EGLDisplay mEGLDisplay = EGL14.EGL_NO_DISPLAY;
        private EGLContext mEGLContext = EGL14.EGL_NO_CONTEXT;
        private EGLSurface mEGLSurface = EGL14.EGL_NO_SURFACE;
        int mWidth;
        int mHeight;

        private Object mFrameSyncObject = new Object();     // guards mFrameAvailable
        private boolean mFrameAvailable;

        private ByteBuffer mPixelBuf;                       // used by saveFrame()

        /**
         * Creates a CodecOutputSurface backed by a pbuffer with the specified dimensions.  The
         * new EGL context and surface will be made current.  Creates a Surface that can be passed
         * to MediaCodec.configure().
         */
        public CodecOutputSurface(int width, int height) {
            if (width <= 0 || height <= 0) {
                throw new IllegalArgumentException();
            }
            mWidth = width;
            mHeight = height;

            eglSetup();
            makeCurrent();
            setup();
        }

        /**
         * Creates interconnected instances of TextureRender, SurfaceTexture, and Surface.
         */
        private void setup() {
            mTextureRender = new ExtractMpegFramesTest.STextureRender();
            mTextureRender.surfaceCreated();

            if (VERBOSE) Log.d(TAG, "textureID=" + mTextureRender.getTextureId());
            mSurfaceTexture = new SurfaceTexture(mTextureRender.getTextureId());

            // This doesn't work if this object is created on the thread that CTS started for
            // these test cases.
            //
            // The CTS-created thread has a Looper, and the SurfaceTexture constructor will
            // create a Handler that uses it.  The "frame available" message is delivered
            // there, but since we're not a Looper-based thread we'll never see it.  For
            // this to do anything useful, CodecOutputSurface must be created on a thread without
            // a Looper, so that SurfaceTexture uses the main application Looper instead.
            //
            // Java language note: passing "this" out of a constructor is generally unwise,
            // but we should be able to get away with it here.
            mSurfaceTexture.setOnFrameAvailableListener(this);

            mSurface = new Surface(mSurfaceTexture);

            mPixelBuf = ByteBuffer.allocateDirect(mWidth * mHeight * 4);
            mPixelBuf.order(ByteOrder.LITTLE_ENDIAN);
        }

        /**
         * Prepares EGL.  We want a GLES 2.0 context and a surface that supports pbuffer.
         */
        private void eglSetup() {
            mEGLDisplay = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
            if (mEGLDisplay == EGL14.EGL_NO_DISPLAY) {
                throw new RuntimeException("unable to get EGL14 display");
            }
            int[] version = new int[2];
            if (!EGL14.eglInitialize(mEGLDisplay, version, 0, version, 1)) {
                mEGLDisplay = null;
                throw new RuntimeException("unable to initialize EGL14");
            }

            // Configure EGL for pbuffer and OpenGL ES 2.0, 24-bit RGB.
            int[] attribList = {
                    EGL14.EGL_RED_SIZE, 8,
                    EGL14.EGL_GREEN_SIZE, 8,
                    EGL14.EGL_BLUE_SIZE, 8,
                    EGL14.EGL_ALPHA_SIZE, 8,
                    EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
                    EGL14.EGL_SURFACE_TYPE, EGL14.EGL_PBUFFER_BIT,
                    EGL14.EGL_NONE
            };
            EGLConfig[] configs = new EGLConfig[1];
            int[] numConfigs = new int[1];
            if (!EGL14.eglChooseConfig(mEGLDisplay, attribList, 0, configs, 0, configs.length,
                    numConfigs, 0)) {
                throw new RuntimeException("unable to find RGB888+recordable ES2 EGL config");
            }

            // Configure context for OpenGL ES 2.0.
            int[] attrib_list = {
                    EGL14.EGL_CONTEXT_CLIENT_VERSION, 2,
                    EGL14.EGL_NONE
            };
            mEGLContext = EGL14.eglCreateContext(mEGLDisplay, configs[0], EGL14.EGL_NO_CONTEXT,
                    attrib_list, 0);
            checkEglError("eglCreateContext");
            if (mEGLContext == null) {
                throw new RuntimeException("null context");
            }

            // Create a pbuffer surface.
            int[] surfaceAttribs = {
                    EGL14.EGL_WIDTH, mWidth,
                    EGL14.EGL_HEIGHT, mHeight,
                    EGL14.EGL_NONE
            };
            mEGLSurface = EGL14.eglCreatePbufferSurface(mEGLDisplay, configs[0], surfaceAttribs, 0);
            checkEglError("eglCreatePbufferSurface");
            if (mEGLSurface == null) {
                throw new RuntimeException("surface was null");
            }
        }

        /**
         * Discard all resources held by this class, notably the EGL context.
         */
        public void release() {
            if (mEGLDisplay != EGL14.EGL_NO_DISPLAY) {
                EGL14.eglDestroySurface(mEGLDisplay, mEGLSurface);
                EGL14.eglDestroyContext(mEGLDisplay, mEGLContext);
                EGL14.eglReleaseThread();
                EGL14.eglTerminate(mEGLDisplay);
            }
            mEGLDisplay = EGL14.EGL_NO_DISPLAY;
            mEGLContext = EGL14.EGL_NO_CONTEXT;
            mEGLSurface = EGL14.EGL_NO_SURFACE;

            mSurface.release();

            // this causes a bunch of warnings that appear harmless but might confuse someone:
            //  W BufferQueue: [unnamed-3997-2] cancelBuffer: BufferQueue has been abandoned!
            //mSurfaceTexture.release();

            mTextureRender = null;
            mSurface = null;
            mSurfaceTexture = null;
        }

        /**
         * Makes our EGL context and surface current.
         */
        public void makeCurrent() {
            if (!EGL14.eglMakeCurrent(mEGLDisplay, mEGLSurface, mEGLSurface, mEGLContext)) {
                throw new RuntimeException("eglMakeCurrent failed");
            }
        }

        /**
         * Returns the Surface.
         */
        public Surface getSurface() {
            return mSurface;
        }

        /**
         * Latches the next buffer into the texture.  Must be called from the thread that created
         * the CodecOutputSurface object.  (More specifically, it must be called on the thread
         * with the EGLContext that contains the GL texture object used by SurfaceTexture.)
         */
        public void awaitNewImage() {
            final int TIMEOUT_MS = 2500;

            synchronized (mFrameSyncObject) {
                while (!mFrameAvailable) {
                    try {
                        // Wait for onFrameAvailable() to signal us.  Use a timeout to avoid
                        // stalling the test if it doesn't arrive.
                        mFrameSyncObject.wait(TIMEOUT_MS);
                        if (!mFrameAvailable) {
                            // TODO: if "spurious wakeup", continue while loop
                            throw new RuntimeException("frame wait timed out");
                        }
                    } catch (InterruptedException ie) {
                        // shouldn't happen
                        throw new RuntimeException(ie);
                    }
                }
                mFrameAvailable = false;
            }

            // Latch the data.
            mTextureRender.checkGlError("before updateTexImage");
            mSurfaceTexture.updateTexImage();
        }

        /**
         * Draws the data from SurfaceTexture onto the current EGL surface.
         *
         * @param invert if set, render the image with Y inverted (0,0 in top left)
         */
        public void drawImage(boolean invert) {
            mTextureRender.drawFrame(mSurfaceTexture, invert);
        }

        // SurfaceTexture callback
        @Override
        public void onFrameAvailable(SurfaceTexture st) {
            if (VERBOSE) Log.d(TAG, "new frame available");
            synchronized (mFrameSyncObject) {
                if (mFrameAvailable) {
                    throw new RuntimeException("mFrameAvailable already set, frame could be dropped");
                }
                mFrameAvailable = true;
                mFrameSyncObject.notifyAll();
            }
        }

        /**
         * Saves the current frame to disk as a PNG image.
         */
        public void saveFrame(String filename) throws IOException {
            // glReadPixels gives us a ByteBuffer filled with what is essentially big-endian RGBA
            // data (i.e. a byte of red, followed by a byte of green...).  To use the Bitmap
            // constructor that takes an int[] array with pixel data, we need an int[] filled
            // with little-endian ARGB data.
            //
            // If we implement this as a series of buf.get() calls, we can spend 2.5 seconds just
            // copying data around for a 720p frame.  It's better to do a bulk get() and then
            // rearrange the data in memory.  (For comparison, the PNG compress takes about 500ms
            // for a trivial frame.)
            //
            // So... we set the ByteBuffer to little-endian, which should turn the bulk IntBuffer
            // get() into a straight memcpy on most Android devices.  Our ints will hold ABGR data.
            // Swapping B and R gives us ARGB.  We need about 30ms for the bulk get(), and another
            // 270ms for the color swap.
            //
            // We can avoid the costly B/R swap here if we do it in the fragment shader (see
            // http://stackoverflow.com/questions/21634450/ ).
            //
            // Having said all that... it turns out that the Bitmap#copyPixelsFromBuffer()
            // method wants RGBA pixels, not ARGB, so if we create an empty bitmap and then
            // copy pixel data in we can avoid the swap issue entirely, and just copy straight
            // into the Bitmap from the ByteBuffer.
            //
            // Making this even more interesting is the upside-down nature of GL, which means
            // our output will look upside-down relative to what appears on screen if the
            // typical GL conventions are used.  (For ExtractMpegFrameTest, we avoid the issue
            // by inverting the frame when we render it.)
            //
            // Allocating large buffers is expensive, so we really want mPixelBuf to be
            // allocated ahead of time if possible.  We still get some allocations from the
            // Bitmap / PNG creation.

            mPixelBuf.rewind();
            GLES20.glReadPixels(0, 0, mWidth, mHeight, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE,
                    mPixelBuf);

            BufferedOutputStream bos = null;
            try {
                bos = new BufferedOutputStream(new FileOutputStream(filename));
                Bitmap bmp = Bitmap.createBitmap(mWidth, mHeight, Bitmap.Config.ARGB_8888);
                mPixelBuf.rewind();
                bmp.copyPixelsFromBuffer(mPixelBuf);
                bmp.compress(Bitmap.CompressFormat.PNG, 90, bos);
                bmp.recycle();
            } finally {
                if (bos != null) bos.close();
            }
            if (VERBOSE) {
                Log.d(TAG, "Saved " + mWidth + "x" + mHeight + " frame as '" + filename + "'");
            }
        }

        /**
         * Checks for EGL errors.
         */
        private void checkEglError(String msg) {
            int error;
            if ((error = EGL14.eglGetError()) != EGL14.EGL_SUCCESS) {
                throw new RuntimeException(msg + ": EGL error: 0x" + Integer.toHexString(error));
            }
        }
    }


    /**
     * Code for rendering a texture onto a surface using OpenGL ES 2.0.
     */
    private static class STextureRender {
        private static final int FLOAT_SIZE_BYTES = 4;
        private static final int TRIANGLE_VERTICES_DATA_STRIDE_BYTES = 5 * FLOAT_SIZE_BYTES;
        private static final int TRIANGLE_VERTICES_DATA_POS_OFFSET = 0;
        private static final int TRIANGLE_VERTICES_DATA_UV_OFFSET = 3;
        private final float[] mTriangleVerticesData = {
                // X, Y, Z, U, V
                -1.0f, -1.0f, 0, 0.f, 0.f,
                1.0f, -1.0f, 0, 1.f, 0.f,
                -1.0f,  1.0f, 0, 0.f, 1.f,
                1.0f,  1.0f, 0, 1.f, 1.f,
        };

        private FloatBuffer mTriangleVertices;

        private static final String VERTEX_SHADER =
                "uniform mat4 uMVPMatrix;\n" +
                        "uniform mat4 uSTMatrix;\n" +
                        "attribute vec4 aPosition;\n" +
                        "attribute vec4 aTextureCoord;\n" +
                        "varying vec2 vTextureCoord;\n" +
                        "void main() {\n" +
                        "    gl_Position = uMVPMatrix * aPosition;\n" +
                        "    vTextureCoord = (uSTMatrix * aTextureCoord).xy;\n" +
                        "}\n";

        private static final String FRAGMENT_SHADER =
                "#extension GL_OES_EGL_image_external : require\n" +
                        "precision mediump float;\n" +      // highp here doesn't seem to matter
                        "varying vec2 vTextureCoord;\n" +
                        "uniform samplerExternalOES sTexture;\n" +
                        "void main() {\n" +
                        "    gl_FragColor = texture2D(sTexture, vTextureCoord);\n" +
                        "}\n";

        private float[] mMVPMatrix = new float[16];
        private float[] mSTMatrix = new float[16];

        private int mProgram;
        private int mTextureID = -12345;
        private int muMVPMatrixHandle;
        private int muSTMatrixHandle;
        private int maPositionHandle;
        private int maTextureHandle;

        public STextureRender() {
            mTriangleVertices = ByteBuffer.allocateDirect(
                    mTriangleVerticesData.length * FLOAT_SIZE_BYTES)
                    .order(ByteOrder.nativeOrder()).asFloatBuffer();
            mTriangleVertices.put(mTriangleVerticesData).position(0);

            Matrix.setIdentityM(mSTMatrix, 0);
        }

        public int getTextureId() {
            return mTextureID;
        }

        /**
         * Draws the external texture in SurfaceTexture onto the current EGL surface.
         */
        public void drawFrame(SurfaceTexture st, boolean invert) {
            checkGlError("onDrawFrame start");
            st.getTransformMatrix(mSTMatrix);
            if (invert) {
                mSTMatrix[5] = -mSTMatrix[5];
                mSTMatrix[13] = 1.0f - mSTMatrix[13];
            }

            // (optional) clear to green so we can see if we're failing to set pixels
            GLES20.glClearColor(0.0f, 1.0f, 0.0f, 1.0f);
            GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);

            GLES20.glUseProgram(mProgram);
            checkGlError("glUseProgram");

            GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
            GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, mTextureID);

            mTriangleVertices.position(TRIANGLE_VERTICES_DATA_POS_OFFSET);
            GLES20.glVertexAttribPointer(maPositionHandle, 3, GLES20.GL_FLOAT, false,
                    TRIANGLE_VERTICES_DATA_STRIDE_BYTES, mTriangleVertices);
            checkGlError("glVertexAttribPointer maPosition");
            GLES20.glEnableVertexAttribArray(maPositionHandle);
            checkGlError("glEnableVertexAttribArray maPositionHandle");

            mTriangleVertices.position(TRIANGLE_VERTICES_DATA_UV_OFFSET);
            GLES20.glVertexAttribPointer(maTextureHandle, 2, GLES20.GL_FLOAT, false,
                    TRIANGLE_VERTICES_DATA_STRIDE_BYTES, mTriangleVertices);
            checkGlError("glVertexAttribPointer maTextureHandle");
            GLES20.glEnableVertexAttribArray(maTextureHandle);
            checkGlError("glEnableVertexAttribArray maTextureHandle");

            Matrix.setIdentityM(mMVPMatrix, 0);
            GLES20.glUniformMatrix4fv(muMVPMatrixHandle, 1, false, mMVPMatrix, 0);
            GLES20.glUniformMatrix4fv(muSTMatrixHandle, 1, false, mSTMatrix, 0);

            GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4);
            checkGlError("glDrawArrays");

            GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, 0);
        }

        /**
         * Initializes GL state.  Call this after the EGL surface has been created and made current.
         */
        public void surfaceCreated() {
            mProgram = createProgram(VERTEX_SHADER, FRAGMENT_SHADER);
            if (mProgram == 0) {
                throw new RuntimeException("failed creating program");
            }

            maPositionHandle = GLES20.glGetAttribLocation(mProgram, "aPosition");
            checkLocation(maPositionHandle, "aPosition");
            maTextureHandle = GLES20.glGetAttribLocation(mProgram, "aTextureCoord");
            checkLocation(maTextureHandle, "aTextureCoord");

            muMVPMatrixHandle = GLES20.glGetUniformLocation(mProgram, "uMVPMatrix");
            checkLocation(muMVPMatrixHandle, "uMVPMatrix");
            muSTMatrixHandle = GLES20.glGetUniformLocation(mProgram, "uSTMatrix");
            checkLocation(muSTMatrixHandle, "uSTMatrix");

            int[] textures = new int[1];
            GLES20.glGenTextures(1, textures, 0);

            mTextureID = textures[0];
            GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, mTextureID);
            checkGlError("glBindTexture mTextureID");

            GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MIN_FILTER,
                    GLES20.GL_NEAREST);
            GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MAG_FILTER,
                    GLES20.GL_LINEAR);
            GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_S,
                    GLES20.GL_CLAMP_TO_EDGE);
            GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_T,
                    GLES20.GL_CLAMP_TO_EDGE);
            checkGlError("glTexParameter");
        }

        /**
         * Replaces the fragment shader.  Pass in null to reset to default.
         */
        public void changeFragmentShader(String fragmentShader) {
            if (fragmentShader == null) {
                fragmentShader = FRAGMENT_SHADER;
            }
            GLES20.glDeleteProgram(mProgram);
            mProgram = createProgram(VERTEX_SHADER, fragmentShader);
            if (mProgram == 0) {
                throw new RuntimeException("failed creating program");
            }
        }

        private int loadShader(int shaderType, String source) {
            int shader = GLES20.glCreateShader(shaderType);
            checkGlError("glCreateShader type=" + shaderType);
            GLES20.glShaderSource(shader, source);
            GLES20.glCompileShader(shader);
            int[] compiled = new int[1];
            GLES20.glGetShaderiv(shader, GLES20.GL_COMPILE_STATUS, compiled, 0);
            if (compiled[0] == 0) {
                Log.e(TAG, "Could not compile shader " + shaderType + ":");
                Log.e(TAG, " " + GLES20.glGetShaderInfoLog(shader));
                GLES20.glDeleteShader(shader);
                shader = 0;
            }
            return shader;
        }

        private int createProgram(String vertexSource, String fragmentSource) {
            int vertexShader = loadShader(GLES20.GL_VERTEX_SHADER, vertexSource);
            if (vertexShader == 0) {
                return 0;
            }
            int pixelShader = loadShader(GLES20.GL_FRAGMENT_SHADER, fragmentSource);
            if (pixelShader == 0) {
                return 0;
            }

            int program = GLES20.glCreateProgram();
            if (program == 0) {
                Log.e(TAG, "Could not create program");
            }
            GLES20.glAttachShader(program, vertexShader);
            checkGlError("glAttachShader");
            GLES20.glAttachShader(program, pixelShader);
            checkGlError("glAttachShader");
            GLES20.glLinkProgram(program);
            int[] linkStatus = new int[1];
            GLES20.glGetProgramiv(program, GLES20.GL_LINK_STATUS, linkStatus, 0);
            if (linkStatus[0] != GLES20.GL_TRUE) {
                Log.e(TAG, "Could not link program: ");
                Log.e(TAG, GLES20.glGetProgramInfoLog(program));
                GLES20.glDeleteProgram(program);
                program = 0;
            }
            return program;
        }

        public void checkGlError(String op) {
            int error;
            while ((error = GLES20.glGetError()) != GLES20.GL_NO_ERROR) {
                Log.e(TAG, op + ": glError " + error);
                throw new RuntimeException(op + ": glError " + error);
            }
        }

        public static void checkLocation(int location, String label) {
            if (location < 0) {
                throw new RuntimeException("Unable to locate '" + label + "' in program");
            }
        }
    }
}

Generating video thumbnails

Method 1

Use MediaMetadataRetriever's getFrameAtTime() method, as follows:

private Bitmap createThumbnailAtTime(String filePath, int timeInSeconds){
    MediaMetadataRetriever mMMR = new MediaMetadataRetriever();
    mMMR.setDataSource(filePath);
    //api time unit is microseconds
    return mMMR.getFrameAtTime(timeInSeconds * 1000000L, MediaMetadataRetriever.OPTION_CLOSEST);
}

getFrameAtTime has several overloads; its option parameter can take the following values:

public static final int OPTION_PREVIOUS_SYNC  
public static final int OPTION_NEXT_SYNC 
public static final int OPTION_CLOSEST_SYNC
public static final int OPTION_CLOSEST

These options let you choose whether the retrieved frame must be a sync (key) frame, and whether to prefer the frame before or after the requested time. There is also an overload that lets you specify the output size.
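The size-specifying overload is getScaledFrameAtTime (API 27+); a minimal sketch, assuming a hypothetical filePath and a thumbnail at the 5-second mark:

MediaMetadataRetriever retriever = new MediaMetadataRetriever();
retriever.setDataSource(filePath);
// the time unit is microseconds; the frame is scaled to fit within 320x180
Bitmap thumb = retriever.getScaledFrameAtTime(
        5 * 1000000L, MediaMetadataRetriever.OPTION_CLOSEST_SYNC, 320, 180);
retriever.release();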

This approach is the most efficient and by far the fastest.

Note, however, that it does not cope well with many badly formed video files: some files contain only one key frame, or very few, which makes it hard to obtain a correct thumbnail.

Method 2

Use MediaExtractor, MediaCodec and OpenGL to extract frame thumbnails from the MP4 file, as in the code above. This approach is moderately fast, but its format compatibility is much better: if the file can be played at all, this method can almost always handle it.

Method 3

Render the video onto a TextureView and obtain the screenshot at a given point in time by converting the TextureView's content into a bitmap. This is the least efficient approach: the video must actually be played on the TextureView before a frame can be captured. Raising the playback speed shortens the process, but it is still the slowest option.
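A minimal sketch of this approach (it assumes the video is already being rendered onto textureView, for example by MediaPlayer or by the MediaCodec pipeline above, and that it runs on the UI thread):

// capture whatever frame the TextureView is currently displaying
Bitmap frame = textureView.getBitmap();
// or let the TextureView scale it down to a thumbnail size
Bitmap thumbnail = textureView.getBitmap(320, 180);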

Method 4

Implement it with FFmpeg.