linux-staging.lists.linux.dev archive mirror
* [PATCH v7 00/11] VP9 codec V4L2 control interface
@ 2021-09-29 16:04 Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 01/11] hantro: postproc: Fix motion vector space size Andrzej Pietrasiewicz
                   ` (13 more replies)
  0 siblings, 14 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel

Dear all,

This patch series adds the VP9 codec V4L2 control interface and two drivers
using the new controls. It is a follow-up to the previous v6 series [1].

In this iteration, we've implemented VP9 hardware decoding on two devices:
Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
The i.MX8M driver still needs proper power domain support, which is the
subject of a separate effort, but we were able to run the drivers in all
three cases.

GStreamer support is also available; the needed changes have been submitted
by Daniel Almeida [2]. That MR is ready to be merged and only needs the
VP9 V4L2 controls to be merged and released.

Both the rkvdec and hantro drivers pass a significant number of VP9 tests
using Fluster [3]. A few tests still fail, due to dynamic frame resize
(not yet supported by V4L2) and small-size videos (due to IP block
limitations).

The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
merged without passing through staging, as agreed[4]. The ABI has been checked
for padding and verified to contain no holes.
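
A minimal sketch of what "no holes" means for a uAPI struct. The struct
below is hypothetical and is NOT one of the structs added by this series;
it only illustrates the kind of layout check involved (in practice a tool
such as pahole is commonly used for this).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control payload, for illustration only. */
struct example_ctrl {
	uint8_t  mode;
	uint8_t  flags;
	uint16_t reserved;	/* explicit padding keeps 'offset' 4-byte aligned */
	uint32_t offset;
};

/* Each member must start exactly where the previous one ends. */
_Static_assert(offsetof(struct example_ctrl, flags) == 1, "hole after mode");
_Static_assert(offsetof(struct example_ctrl, reserved) == 2, "hole after flags");
_Static_assert(offsetof(struct example_ctrl, offset) == 4, "hole after reserved");

/* Sum of the member sizes; equals sizeof() only if there is no padding. */
static size_t example_ctrl_member_bytes(void)
{
	return sizeof(uint8_t) + sizeof(uint8_t) +
	       sizeof(uint16_t) + sizeof(uint32_t);
}
```

If the member sizes add up to sizeof(struct example_ctrl), the compiler
inserted no implicit padding, so 32-bit and 64-bit userspace see the same
layout.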

[1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
[2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
[3] https://github.com/fluendo/fluster
[4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/

The series depends on the YUV tiled format support prepared by Ezequiel:
https://www.spinics.net/lists/linux-media/msg197047.html

Rebased onto latest media_tree.

Changes since v6:
- moved setting tile filter and tile bsd auxiliary buffer addresses so
  that they are always set, even if no tiles are used (thanks, Jernej)
- added a comment near the place where the 32-bit DMA mask is applied
  (thanks, Nicolas)
- improved consistency in register names (thanks, Nicolas)

Changes since v5:
- improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
- improved pdf output of documentation
- added Benjamin's Reviewed-by (thanks, Benjamin)

Changes since v4:
- removed unused enum v4l2_vp9_intra_prediction_mode
- converted remaining enums to defines to follow the convention
- improved the documentation, in particular better documented how to use
  segmentation features

Changes since v3:

Apply suggestions from Jernej's review (thanks, Jernej):
- renamed a control and two structs:
	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
	v4l2_ctrl_vp9_compressed_hdr_probs =>
		v4l2_ctrl_vp9_compressed_hdr
	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
- moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
- fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
- explicitly assigned values to all other vp9 enums

Apply suggestion from Nicolas's review (thanks, Nicolas):
- explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
  and implemented only by drivers which need it

Changes since the RFC v2:

- added another driver including a postprocessor to de-tile
        codec-specific tiling
- reworked uAPI structs layout to follow VP8 style
- changed validation of loop filter params
- changed validation of segmentation params
- changed validation of VP9 frame params
- removed level lookup array from loop filter struct
        (can be computed by drivers)
- renamed some enum values to match the spec more closely
- V4L2 VP9 library changed the 'eob' member of
        'struct v4l2_vp9_frame_symbol_counts' so that it is an array
        of pointers instead of an array of pointers to arrays
        (IPs such as g2 creatively pass parts of the 'eob' counts in
        the 'coeff' counts)
- factored out several repeated portions of code
- minor nitpicks and cleanups

Andrzej Pietrasiewicz (6):
  media: uapi: Add VP9 stateless decoder controls
  media: Add VP9 v4l2 library
  media: hantro: Rename registers
  media: hantro: Prepare for other G2 codecs
  media: hantro: Support VP9 on the G2 core
  media: hantro: Support NV12 on the G2 core

Boris Brezillon (1):
  media: rkvdec: Add the VP9 backend

Ezequiel Garcia (4):
  hantro: postproc: Fix motion vector space size
  hantro: postproc: Introduce struct hantro_postproc_ops
  hantro: Simplify postprocessor
  hantro: Add quirk for NV12/NV12_4L4 capture format

 .../userspace-api/media/v4l/biblio.rst        |   10 +
 .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
 .../media/v4l/pixfmt-compressed.rst           |   15 +
 .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
 .../media/v4l/vidioc-queryctrl.rst            |   12 +
 .../media/videodev2.h.rst.exceptions          |    2 +
 drivers/media/v4l2-core/Kconfig               |    4 +
 drivers/media/v4l2-core/Makefile              |    1 +
 drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
 drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
 drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
 drivers/staging/media/hantro/Kconfig          |    1 +
 drivers/staging/media/hantro/Makefile         |    7 +-
 drivers/staging/media/hantro/hantro.h         |   40 +-
 drivers/staging/media/hantro/hantro_drv.c     |   23 +-
 drivers/staging/media/hantro/hantro_g2.c      |   27 +
 .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
 drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
 .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
 drivers/staging/media/hantro/hantro_hw.h      |   83 +-
 .../staging/media/hantro/hantro_postproc.c    |   79 +-
 drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
 drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
 drivers/staging/media/hantro/hantro_vp9.h     |  103 +
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
 .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
 .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
 drivers/staging/media/rkvdec/Kconfig          |    1 +
 drivers/staging/media/rkvdec/Makefile         |    2 +-
 drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
 drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
 drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
 include/media/v4l2-ctrls.h                    |    4 +
 include/media/v4l2-vp9.h                      |  182 ++
 include/uapi/linux/v4l2-controls.h            |  284 +++
 include/uapi/linux/videodev2.h                |    6 +
 37 files changed, 6033 insertions(+), 104 deletions(-)
 create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
 create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
 create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
 create mode 100644 include/media/v4l2-vp9.h


base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
-- 
2.17.1



* [PATCH v7 01/11] hantro: postproc: Fix motion vector space size
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 02/11] hantro: postproc: Introduce struct hantro_postproc_ops Andrzej Pietrasiewicz
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

From: Ezequiel Garcia <ezequiel@collabora.com>

When the post-processor hardware block is enabled, the driver
allocates an internal queue of buffers for the decoder engine,
and uses the vb2 queue for the post-processor engine.

For instance, on a G1 core, the decoder engine produces NV12 buffers
and the post-processor engine can produce YUY2 buffers. The decoder
engine expects motion vectors to be appended to the NV12 buffers,
but this is only required for CODECs that need motion vectors,
such as H.264.

Fix the post-processor logic accordingly.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
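
A minimal sketch of the sizing rule described above. The names and the
motion-vector size formula are hypothetical stand-ins, not the driver's
hantro_h264_mv_size(); only the "add MV space for H.264 only" logic
mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the driver's motion vector size helper.
 * Assumption for illustration only: 64 bytes per 16x16 macroblock.
 */
static size_t example_mv_size(unsigned int width, unsigned int height)
{
	size_t mb_w = (width + 15) / 16;
	size_t mb_h = (height + 15) / 16;

	return 64 * mb_w * mb_h;
}

/* Size of one internal decoder buffer when the post-processor is used. */
static size_t example_dec_buf_size(size_t sizeimage, bool is_h264,
				   unsigned int width, unsigned int height)
{
	size_t buf_size = sizeimage;

	/* Only codecs that need motion vectors get the extra space. */
	if (is_h264)
		buf_size += example_mv_size(width, height);

	return buf_size;
}
```

With this shape, a non-H.264 stream gets a buffer of exactly sizeimage
bytes, while an H.264 stream gets sizeimage plus the motion vector area.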
 drivers/staging/media/hantro/hantro_postproc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index ed8916c950a4..07842152003f 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -132,9 +132,10 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
 	unsigned int num_buffers = cap_queue->num_buffers;
 	unsigned int i, buf_size;
 
-	buf_size = ctx->dst_fmt.plane_fmt[0].sizeimage +
-		   hantro_h264_mv_size(ctx->dst_fmt.width,
-				       ctx->dst_fmt.height);
+	buf_size = ctx->dst_fmt.plane_fmt[0].sizeimage;
+	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
+		buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
+						ctx->dst_fmt.height);
 
 	for (i = 0; i < num_buffers; ++i) {
 		struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
-- 
2.17.1



* [PATCH v7 02/11] hantro: postproc: Introduce struct hantro_postproc_ops
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 01/11] hantro: postproc: Fix motion vector space size Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 03/11] hantro: Simplify postprocessor Andrzej Pietrasiewicz
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

From: Ezequiel Garcia <ezequiel@collabora.com>

It turns out the post-processor block on the G2 core is substantially
different from the one on the G1 core. Introduce hantro_postproc_ops
with .enable and .disable methods, which will allow the G2
post-processor to be supported cleanly.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
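
A self-contained sketch of the dispatch pattern this patch introduces:
a per-variant ops table whose callbacks are both optional. All example_*
names are hypothetical; the integer field stands in for the pipeline
enable register write done by the real G1 callbacks.

```c
#include <assert.h>
#include <stddef.h>

struct example_ctx;

/* Per-variant post-processor callbacks; either may be NULL. */
struct example_postproc_ops {
	void (*enable)(struct example_ctx *ctx);
	void (*disable)(struct example_ctx *ctx);
};

struct example_variant {
	const struct example_postproc_ops *postproc_ops;
};

struct example_ctx {
	const struct example_variant *variant;
	int pipeline_en;	/* stand-in for the hardware register */
};

static void example_g1_enable(struct example_ctx *ctx)
{
	ctx->pipeline_en = 1;
}

static void example_g1_disable(struct example_ctx *ctx)
{
	ctx->pipeline_en = 0;
}

static const struct example_postproc_ops example_g1_ops = {
	.enable = example_g1_enable,
	.disable = example_g1_disable,
};

/* Generic entry points: do nothing if the variant has no ops/callback. */
static void example_postproc_enable(struct example_ctx *ctx)
{
	const struct example_postproc_ops *ops = ctx->variant->postproc_ops;

	if (ops && ops->enable)
		ops->enable(ctx);
}

static void example_postproc_disable(struct example_ctx *ctx)
{
	const struct example_postproc_ops *ops = ctx->variant->postproc_ops;

	if (ops && ops->disable)
		ops->disable(ctx);
}
```

Callers only use the generic entry points, so a second ops table (for
instance for a different core) can be added without touching them.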
 drivers/staging/media/hantro/hantro.h         |  5 +--
 drivers/staging/media/hantro/hantro_hw.h      | 13 +++++++-
 .../staging/media/hantro/hantro_postproc.c    | 33 ++++++++++++++-----
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |  2 +-
 .../staging/media/hantro/rockchip_vpu_hw.c    |  6 ++--
 .../staging/media/hantro/sama5d4_vdec_hw.c    |  2 +-
 6 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index c2e2dca38628..c2e01959dc00 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -28,6 +28,7 @@
 
 struct hantro_ctx;
 struct hantro_codec_ops;
+struct hantro_postproc_ops;
 
 #define HANTRO_JPEG_ENCODER	BIT(0)
 #define HANTRO_ENCODERS		0x0000ffff
@@ -59,6 +60,7 @@ struct hantro_irq {
  * @num_dec_fmts:		Number of decoder formats.
  * @postproc_fmts:		Post-processor formats.
  * @num_postproc_fmts:		Number of post-processor formats.
+ * @postproc_ops:		Post-processor ops.
  * @codec:			Supported codecs
  * @codec_ops:			Codec ops.
  * @init:			Initialize hardware, optional.
@@ -69,7 +71,6 @@ struct hantro_irq {
  * @num_clocks:			number of clocks in the array
  * @reg_names:			array of register range names
  * @num_regs:			number of register range names in the array
- * @postproc_regs:		&struct hantro_postproc_regs pointer
  */
 struct hantro_variant {
 	unsigned int enc_offset;
@@ -80,6 +81,7 @@ struct hantro_variant {
 	unsigned int num_dec_fmts;
 	const struct hantro_fmt *postproc_fmts;
 	unsigned int num_postproc_fmts;
+	const struct hantro_postproc_ops *postproc_ops;
 	unsigned int codec;
 	const struct hantro_codec_ops *codec_ops;
 	int (*init)(struct hantro_dev *vpu);
@@ -90,7 +92,6 @@ struct hantro_variant {
 	int num_clocks;
 	const char * const *reg_names;
 	int num_regs;
-	const struct hantro_postproc_regs *postproc_regs;
 };
 
 /**
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index df7b5e3a57b9..4323e63dfbfc 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -170,6 +170,17 @@ struct hantro_postproc_ctx {
 	struct hantro_aux_buf dec_q[VB2_MAX_FRAME];
 };
 
+/**
+ * struct hantro_postproc_ops - post-processor operations
+ *
+ * @enable:	Enable the post-processor block. Optional.
+ * @disable:	Disable the post-processor block. Optional.
+ */
+struct hantro_postproc_ops {
+	void (*enable)(struct hantro_ctx *ctx);
+	void (*disable)(struct hantro_ctx *ctx);
+};
+
 /**
  * struct hantro_codec_ops - codec mode specific operations
  *
@@ -217,7 +228,7 @@ extern const struct hantro_variant rk3328_vpu_variant;
 extern const struct hantro_variant rk3399_vpu_variant;
 extern const struct hantro_variant sama5d4_vdec_variant;
 
-extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
+extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
 
 extern const u32 hantro_vp8_dec_mc_filter[8][6];
 
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 07842152003f..882fb8bc5ddd 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -15,14 +15,14 @@
 #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
 { \
 	hantro_reg_write(vpu, \
-			 &(vpu)->variant->postproc_regs->reg_name, \
+			 &hantro_g1_postproc_regs.reg_name, \
 			 val); \
 }
 
 #define HANTRO_PP_REG_WRITE_S(vpu, reg_name, val) \
 { \
 	hantro_reg_write_s(vpu, \
-			   &(vpu)->variant->postproc_regs->reg_name, \
+			   &hantro_g1_postproc_regs.reg_name, \
 			   val); \
 }
 
@@ -64,16 +64,13 @@ bool hantro_needs_postproc(const struct hantro_ctx *ctx,
 	return fmt->fourcc != V4L2_PIX_FMT_NV12;
 }
 
-void hantro_postproc_enable(struct hantro_ctx *ctx)
+static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *dst_buf;
 	u32 src_pp_fmt, dst_pp_fmt;
 	dma_addr_t dst_dma;
 
-	if (!vpu->variant->postproc_regs)
-		return;
-
 	/* Turn on pipeline mode. Must be done first. */
 	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x1);
 
@@ -154,12 +151,30 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
 	return 0;
 }
 
+static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 
-	if (!vpu->variant->postproc_regs)
-		return;
+	if (vpu->variant->postproc_ops && vpu->variant->postproc_ops->disable)
+		vpu->variant->postproc_ops->disable(ctx);
+}
 
-	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
+void hantro_postproc_enable(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	if (vpu->variant->postproc_ops && vpu->variant->postproc_ops->enable)
+		vpu->variant->postproc_ops->enable(ctx);
 }
+
+const struct hantro_postproc_ops hantro_g1_postproc_ops = {
+	.enable = hantro_postproc_g1_enable,
+	.disable = hantro_postproc_g1_disable,
+};
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index ea919bfb9891..22fa7d2f3b64 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -262,7 +262,7 @@ const struct hantro_variant imx8mq_vpu_variant = {
 	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_dec_fmts),
 	.postproc_fmts = imx8m_vpu_postproc_fmts,
 	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts),
-	.postproc_regs = &hantro_g1_postproc_regs,
+	.postproc_ops = &hantro_g1_postproc_ops,
 	.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
 		 HANTRO_H264_DECODER,
 	.codec_ops = imx8mq_vpu_codec_ops,
diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c
index d4f52957cc53..6c1ad5534ce5 100644
--- a/drivers/staging/media/hantro/rockchip_vpu_hw.c
+++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c
@@ -460,7 +460,7 @@ const struct hantro_variant rk3036_vpu_variant = {
 	.num_dec_fmts = ARRAY_SIZE(rk3066_vpu_dec_fmts),
 	.postproc_fmts = rockchip_vpu1_postproc_fmts,
 	.num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts),
-	.postproc_regs = &hantro_g1_postproc_regs,
+	.postproc_ops = &hantro_g1_postproc_ops,
 	.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
 		 HANTRO_H264_DECODER,
 	.codec_ops = rk3036_vpu_codec_ops,
@@ -485,7 +485,7 @@ const struct hantro_variant rk3066_vpu_variant = {
 	.num_dec_fmts = ARRAY_SIZE(rk3066_vpu_dec_fmts),
 	.postproc_fmts = rockchip_vpu1_postproc_fmts,
 	.num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts),
-	.postproc_regs = &hantro_g1_postproc_regs,
+	.postproc_ops = &hantro_g1_postproc_ops,
 	.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
 		 HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
 	.codec_ops = rk3066_vpu_codec_ops,
@@ -505,7 +505,7 @@ const struct hantro_variant rk3288_vpu_variant = {
 	.num_dec_fmts = ARRAY_SIZE(rk3288_vpu_dec_fmts),
 	.postproc_fmts = rockchip_vpu1_postproc_fmts,
 	.num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts),
-	.postproc_regs = &hantro_g1_postproc_regs,
+	.postproc_ops = &hantro_g1_postproc_ops,
 	.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
 		 HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
 	.codec_ops = rk3288_vpu_codec_ops,
diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
index 9c3b8cd0b239..f3fecc7248c4 100644
--- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c
+++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
@@ -100,7 +100,7 @@ const struct hantro_variant sama5d4_vdec_variant = {
 	.num_dec_fmts = ARRAY_SIZE(sama5d4_vdec_fmts),
 	.postproc_fmts = sama5d4_vdec_postproc_fmts,
 	.num_postproc_fmts = ARRAY_SIZE(sama5d4_vdec_postproc_fmts),
-	.postproc_regs = &hantro_g1_postproc_regs,
+	.postproc_ops = &hantro_g1_postproc_ops,
 	.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
 		 HANTRO_H264_DECODER,
 	.codec_ops = sama5d4_vdec_codec_ops,
-- 
2.17.1



* [PATCH v7 03/11] hantro: Simplify postprocessor
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 01/11] hantro: postproc: Fix motion vector space size Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 02/11] hantro: postproc: Introduce struct hantro_postproc_ops Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 04/11] hantro: Add quirk for NV12/NV12_4L4 capture format Andrzej Pietrasiewicz
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

From: Ezequiel Garcia <ezequiel@collabora.com>

Add a 'postprocessed' boolean property to struct hantro_fmt
to signal that a format is produced by the post-processor.
This will make it simple to introduce the G2 post-processor.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
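
A small sketch of the resulting decision, assuming a simplified format
descriptor. The example_* names are hypothetical; the fourcc values are
built the same way V4L2 builds them, and the flag takes the place of the
old "anything that is not NV12" comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same packing as the V4L2 fourcc macro: first character in the low byte. */
#define EXAMPLE_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Simplified, hypothetical format descriptor. */
struct example_fmt {
	uint32_t fourcc;
	bool postprocessed;	/* true if the post-processor produces it */
};

static const struct example_fmt example_nv12 = {
	.fourcc = EXAMPLE_FOURCC('N', 'V', '1', '2'),
	.postprocessed = false,	/* native decoder output in this sketch */
};

static const struct example_fmt example_yuyv = {
	.fourcc = EXAMPLE_FOURCC('Y', 'U', 'Y', 'V'),
	.postprocessed = true,
};

/* Encoders never use the post-processor; decoders follow the flag. */
static bool example_needs_postproc(bool is_encoder,
				   const struct example_fmt *fmt)
{
	if (is_encoder)
		return false;
	return fmt->postprocessed;
}
```

Because the decision is carried by the format entry itself, a core whose
native output is not NV12 needs no change to the helper.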
 drivers/staging/media/hantro/hantro.h          | 2 ++
 drivers/staging/media/hantro/hantro_postproc.c | 8 +-------
 drivers/staging/media/hantro/imx8m_vpu_hw.c    | 1 +
 drivers/staging/media/hantro/rockchip_vpu_hw.c | 1 +
 drivers/staging/media/hantro/sama5d4_vdec_hw.c | 1 +
 5 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index c2e01959dc00..dd5e56765d4e 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -263,6 +263,7 @@ struct hantro_ctx {
  * @max_depth:	Maximum depth, for bitstream formats
  * @enc_fmt:	Format identifier for encoder registers.
  * @frmsize:	Supported range of frame sizes (only for bitstream formats).
+ * @postprocessed: Indicates if this format needs the post-processor.
  */
 struct hantro_fmt {
 	char *name;
@@ -272,6 +273,7 @@ struct hantro_fmt {
 	int max_depth;
 	enum hantro_enc_fmt enc_fmt;
 	struct v4l2_frmsize_stepwise frmsize;
+	bool postprocessed;
 };
 
 struct hantro_reg {
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 882fb8bc5ddd..4549aec08feb 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -53,15 +53,9 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
 bool hantro_needs_postproc(const struct hantro_ctx *ctx,
 			   const struct hantro_fmt *fmt)
 {
-	struct hantro_dev *vpu = ctx->dev;
-
 	if (ctx->is_encoder)
 		return false;
-
-	if (!vpu->variant->postproc_fmts)
-		return false;
-
-	return fmt->fourcc != V4L2_PIX_FMT_NV12;
+	return fmt->postprocessed;
 }
 
 static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index 22fa7d2f3b64..02e61438220a 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -82,6 +82,7 @@ static const struct hantro_fmt imx8m_vpu_postproc_fmts[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_YUYV,
 		.codec_mode = HANTRO_MODE_NONE,
+		.postprocessed = true,
 	},
 };
 
diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c
index 6c1ad5534ce5..f372f767d4ff 100644
--- a/drivers/staging/media/hantro/rockchip_vpu_hw.c
+++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c
@@ -62,6 +62,7 @@ static const struct hantro_fmt rockchip_vpu1_postproc_fmts[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_YUYV,
 		.codec_mode = HANTRO_MODE_NONE,
+		.postprocessed = true,
 	},
 };
 
diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
index f3fecc7248c4..b2fc1c5613e1 100644
--- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c
+++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
@@ -15,6 +15,7 @@ static const struct hantro_fmt sama5d4_vdec_postproc_fmts[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_YUYV,
 		.codec_mode = HANTRO_MODE_NONE,
+		.postprocessed = true,
 	},
 };
 
-- 
2.17.1



* [PATCH v7 04/11] hantro: Add quirk for NV12/NV12_4L4 capture format
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (2 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 03/11] hantro: Simplify postprocessor Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 05/11] media: uapi: Add VP9 stateless decoder controls Andrzej Pietrasiewicz
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

From: Ezequiel Garcia <ezequiel@collabora.com>

The G2 core decoder engine produces NV12_4L4 format,
which is a simple NV12 4x4 tiled format. The driver currently
hides this format by always enabling the post-processor engine,
and therefore offering NV12 directly.

This is done without using the logic in hantro_postproc.c
and therefore makes it difficult to add VP9 cleanly.

Since fixing this is not easy, add a small quirk to force
NV12 if HEVC was configured, but otherwise declare NV12_4L4
as the pixel format in imx8mq_vpu_g2_variant.dec_fmts.

This will be used by the VP9 decoder which will be added soon.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
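
A sketch of the enumeration behaviour the quirk produces on the capture
side, under simplifying assumptions: the enum, the function name and the
one-entry native format table are hypothetical, and the real
vidioc_enum_fmt() also walks the post-processor formats.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum example_pixfmt {
	EXAMPLE_FMT_NV12,
	EXAMPLE_FMT_NV12_4L4,
};

/* Native decoder output formats in this sketch (tiled only). */
static const enum example_pixfmt example_dec_fmts[] = {
	EXAMPLE_FMT_NV12_4L4,
};

/* Returns 0 and fills *out for a valid index, -EINVAL otherwise. */
static int example_enum_capture_fmt(bool src_is_hevc, unsigned int index,
				    enum example_pixfmt *out)
{
	size_t num = sizeof(example_dec_fmts) / sizeof(example_dec_fmts[0]);

	/* Quirk: with HEVC configured, offer linear NV12 and nothing else. */
	if (src_is_hevc) {
		if (index == 0) {
			*out = EXAMPLE_FMT_NV12;
			return 0;
		}
		return -EINVAL;
	}

	if (index >= num)
		return -EINVAL;

	*out = example_dec_fmts[index];
	return 0;
}
```

In this sketch an HEVC session enumerates exactly one capture format
(NV12), while any other session sees the tiled format from the table.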
 drivers/staging/media/hantro/hantro_v4l2.c  | 14 ++++++++++++++
 drivers/staging/media/hantro/imx8m_vpu_hw.c |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index bcb0bdff4a9a..d1f060c55fed 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -150,6 +150,20 @@ static int vidioc_enum_fmt(struct file *file, void *priv,
 	unsigned int num_fmts, i, j = 0;
 	bool skip_mode_none;
 
+	/*
+	 * The HEVC decoder on the G2 core needs a little quirk to offer NV12
+	 * only on the capture side. Once the post-processor logic is used,
+	 * we will be able to expose NV12_4L4 and NV12 as the other cases,
+	 * and therefore remove this quirk.
+	 */
+	if (capture && ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) {
+		if (f->index == 0) {
+			f->pixelformat = V4L2_PIX_FMT_NV12;
+			return 0;
+		}
+		return -EINVAL;
+	}
+
 	/*
 	 * When dealing with an encoder:
 	 *  - on the capture side we want to filter out all MODE_NONE formats.
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index 02e61438220a..a40b161e5956 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -134,7 +134,7 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
 
 static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
 	{
-		.fourcc = V4L2_PIX_FMT_NV12,
+		.fourcc = V4L2_PIX_FMT_NV12_4L4,
 		.codec_mode = HANTRO_MODE_NONE,
 	},
 	{
-- 
2.17.1



* [PATCH v7 05/11] media: uapi: Add VP9 stateless decoder controls
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (3 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 04/11] hantro: Add quirk for NV12/NV12_4L4 capture format Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 06/11] media: Add VP9 v4l2 library Andrzej Pietrasiewicz
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia,
	Adrian Ratiu, Daniel Almeida

Add the VP9 stateless decoder controls plus the documentation that goes
with them.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Co-developed-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Co-developed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
---
 .../userspace-api/media/v4l/biblio.rst        |  10 +
 .../media/v4l/ext-ctrls-codec-stateless.rst   | 573 ++++++++++++++++++
 .../media/v4l/pixfmt-compressed.rst           |  15 +
 .../media/v4l/vidioc-g-ext-ctrls.rst          |   8 +
 .../media/v4l/vidioc-queryctrl.rst            |  12 +
 .../media/videodev2.h.rst.exceptions          |   2 +
 drivers/media/v4l2-core/v4l2-ctrls-core.c     | 180 ++++++
 drivers/media/v4l2-core/v4l2-ctrls-defs.c     |   8 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |   1 +
 include/media/v4l2-ctrls.h                    |   4 +
 include/uapi/linux/v4l2-controls.h            | 284 +++++++++
 include/uapi/linux/videodev2.h                |   6 +
 12 files changed, 1103 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/biblio.rst b/Documentation/userspace-api/media/v4l/biblio.rst
index 7b8e6738ff9e..9cd18c153d19 100644
--- a/Documentation/userspace-api/media/v4l/biblio.rst
+++ b/Documentation/userspace-api/media/v4l/biblio.rst
@@ -417,3 +417,13 @@ VP8
 :title:     RFC 6386: "VP8 Data Format and Decoding Guide"
 
 :author:    J. Bankoski et al.
+
+.. _vp9:
+
+VP9
+===
+
+
+:title:     VP9 Bitstream & Decoding Process Specification
+
+:author:    Adrian Grange (Google), Peter de Rivaz (Argon Design), Jonathan Hunt (Argon Design)
diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
index 72f5e85b4f34..cc080c4257d0 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
@@ -1458,3 +1458,576 @@ FWHT Flags
 .. raw:: latex
 
     \normalsize
+
+.. _v4l2-codec-stateless-vp9:
+
+``V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (struct)``
+    Stores VP9 probabilities updates as parsed from the current compressed frame
+    header. A value of zero in an array element means no update of the relevant
+    probability. Motion vector-related updates contain a new value or zero. All
+    other updates contain values translated with inv_map_table[] (see 6.3.5 in
+    :ref:`vp9`).
+
+.. c:type:: v4l2_ctrl_vp9_compressed_hdr
+
+.. tabularcolumns:: |p{1cm}|p{4.8cm}|p{11.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_ctrl_vp9_compressed_hdr
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __u8
+      - ``tx_mode``
+      - Specifies the TX mode. See :ref:`TX Mode <vp9_tx_mode>` for more details.
+    * - __u8
+      - ``tx8[2][1]``
+      - TX 8x8 probabilities delta.
+    * - __u8
+      - ``tx16[2][2]``
+      - TX 16x16 probabilities delta.
+    * - __u8
+      - ``tx32[2][3]``
+      - TX 32x32 probabilities delta.
+    * - __u8
+      - ``coef[4][2][2][6][6][3]``
+      - Coefficient probabilities delta.
+    * - __u8
+      - ``skip[3]``
+      - Skip probabilities delta.
+    * - __u8
+      - ``inter_mode[7][3]``
+      - Inter prediction mode probabilities delta.
+    * - __u8
+      - ``interp_filter[4][2]``
+      - Interpolation filter probabilities delta.
+    * - __u8
+      - ``is_inter[4]``
+      - Is inter-block probabilities delta.
+    * - __u8
+      - ``comp_mode[5]``
+      - Compound prediction mode probabilities delta.
+    * - __u8
+      - ``single_ref[5][2]``
+      - Single reference probabilities delta.
+    * - __u8
+      - ``comp_ref[5]``
+      - Compound reference probabilities delta.
+    * - __u8
+      - ``y_mode[4][9]``
+      - Y prediction mode probabilities delta.
+    * - __u8
+      - ``uv_mode[10][9]``
+      - UV prediction mode probabilities delta.
+    * - __u8
+      - ``partition[16][3]``
+      - Partition probabilities delta.
+    * - __u8
+      - ``mv.joint[3]``
+      - Motion vector joint probabilities delta.
+    * - __u8
+      - ``mv.sign[2]``
+      - Motion vector sign probabilities delta.
+    * - __u8
+      - ``mv.classes[2][10]``
+      - Motion vector class probabilities delta.
+    * - __u8
+      - ``mv.class0_bit[2]``
+      - Motion vector class0 bit probabilities delta.
+    * - __u8
+      - ``mv.bits[2][10]``
+      - Motion vector bits probabilities delta.
+    * - __u8
+      - ``mv.class0_fr[2][2][3]``
+      - Motion vector class0 fractional bit probabilities delta.
+    * - __u8
+      - ``mv.fr[2][3]``
+      - Motion vector fractional bit probabilities delta.
+    * - __u8
+      - ``mv.class0_hp[2]``
+      - Motion vector class0 high precision fractional bit probabilities delta.
+    * - __u8
+      - ``mv.hp[2]``
+      - Motion vector high precision fractional bit probabilities delta.
+
+.. _vp9_tx_mode:
+
+``TX Mode``
+
+.. tabularcolumns:: |p{6.5cm}|p{0.5cm}|p{10.3cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_TX_MODE_ONLY_4X4``
+      - 0
+      - Transform size is 4x4.
+    * - ``V4L2_VP9_TX_MODE_ALLOW_8X8``
+      - 1
+      - Transform size can be up to 8x8.
+    * - ``V4L2_VP9_TX_MODE_ALLOW_16X16``
+      - 2
+      - Transform size can be up to 16x16.
+    * - ``V4L2_VP9_TX_MODE_ALLOW_32X32``
+      - 3
+      - Transform size can be up to 32x32.
+    * - ``V4L2_VP9_TX_MODE_SELECT``
+      - 4
+      - Bitstream contains the transform size for each block.
+
+See section '7.3.1 Tx mode semantics' of the :ref:`vp9` specification for more details.
+
+``V4L2_CID_STATELESS_VP9_FRAME (struct)``
+    Specifies the frame parameters for the associated VP9 frame decode request.
+    This includes the necessary parameters for configuring a stateless hardware
+    decoding pipeline for VP9. The bitstream parameters are defined according
+    to :ref:`vp9`.
+
+.. c:type:: v4l2_ctrl_vp9_frame
+
+.. raw:: latex
+
+    \small
+
+.. tabularcolumns:: |p{4.7cm}|p{5.5cm}|p{7.1cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_ctrl_vp9_frame
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - struct :c:type:`v4l2_vp9_loop_filter`
+      - ``lf``
+      - Loop filter parameters. See struct :c:type:`v4l2_vp9_loop_filter` for more details.
+    * - struct :c:type:`v4l2_vp9_quantization`
+      - ``quant``
+      - Quantization parameters. See :c:type:`v4l2_vp9_quantization` for more details.
+    * - struct :c:type:`v4l2_vp9_segmentation`
+      - ``seg``
+      - Segmentation parameters. See :c:type:`v4l2_vp9_segmentation` for more details.
+    * - __u32
+      - ``flags``
+      - Combination of V4L2_VP9_FRAME_FLAG_* flags. See :ref:`Frame Flags<vp9_frame_flags>`.
+    * - __u16
+      - ``compressed_header_size``
+      - Compressed header size in bytes.
+    * - __u16
+      - ``uncompressed_header_size``
+      - Uncompressed header size in bytes.
+    * - __u16
+      - ``frame_width_minus_1``
+      - Add 1 to get the frame width expressed in pixels. See section 7.2.3 in :ref:`vp9`.
+    * - __u16
+      - ``frame_height_minus_1``
+      - Add 1 to get the frame height expressed in pixels. See section 7.2.3 in :ref:`vp9`.
+    * - __u16
+      - ``render_width_minus_1``
+      - Add 1 to get the expected render width expressed in pixels. This is
+        not used during the decoding process but might be used by HW scalers to
+        prepare a frame that's ready for scanout. See section 7.2.4 in :ref:`vp9`.
+    * - __u16
+      - ``render_height_minus_1``
+      - Add 1 to get the expected render height expressed in pixels. This is
+        not used during the decoding process but might be used by HW scalers to
+        prepare a frame that's ready for scanout. See section 7.2.4 in :ref:`vp9`.
+    * - __u64
+      - ``last_frame_ts``
+      - "last" reference buffer timestamp.
+	The timestamp refers to the ``timestamp`` field in
+        struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()`
+        function to convert the struct :c:type:`timeval` in struct
+        :c:type:`v4l2_buffer` to a __u64.
+    * - __u64
+      - ``golden_frame_ts``
+      - "golden" reference buffer timestamp.
+	The timestamp refers to the ``timestamp`` field in
+        struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()`
+        function to convert the struct :c:type:`timeval` in struct
+        :c:type:`v4l2_buffer` to a __u64.
+    * - __u64
+      - ``alt_frame_ts``
+      - "alt" reference buffer timestamp.
+	The timestamp refers to the ``timestamp`` field in
+        struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()`
+        function to convert the struct :c:type:`timeval` in struct
+        :c:type:`v4l2_buffer` to a __u64.
+    * - __u8
+      - ``ref_frame_sign_bias``
+      - A bitfield specifying whether the sign bias is set for a given
+        reference frame. See :ref:`Reference Frame Sign Bias<vp9_ref_frame_sign_bias>`
+        for more details.
+    * - __u8
+      - ``reset_frame_context``
+      - Specifies whether the frame context should be reset to default values. See
+        :ref:`Reset Frame Context<vp9_reset_frame_context>` for more details.
+    * - __u8
+      - ``frame_context_idx``
+      - Frame context that should be used/updated.
+    * - __u8
+      - ``profile``
+      - VP9 profile. Can be 0, 1, 2 or 3.
+    * - __u8
+      - ``bit_depth``
+      - Component depth in bits. Can be 8, 10 or 12. Note that not all profiles
+        support 10 and/or 12 bits depths.
+    * - __u8
+      - ``interpolation_filter``
+      - Specifies the filter selection used for performing inter prediction. See
+        :ref:`Interpolation Filter<vp9_interpolation_filter>` for more details.
+    * - __u8
+      - ``tile_cols_log2``
+      - Specifies the base 2 logarithm of the width of each tile (where the
+        width is measured in units of 8x8 blocks). Shall be less than or equal
+        to 6.
+    * - __u8
+      - ``tile_rows_log2``
+      - Specifies the base 2 logarithm of the height of each tile (where the
+        height is measured in units of 8x8 blocks).
+    * - __u8
+      - ``reference_mode``
+      - Specifies the type of inter prediction to be used. See
+        :ref:`Reference Mode<vp9_reference_mode>` for more details.
+    * - __u8
+      - ``reserved[7]``
+      - Applications and drivers must set this to zero.
+
+.. raw:: latex
+
+    \normalsize
+
+.. _vp9_frame_flags:
+
+``Frame Flags``
+
+.. tabularcolumns:: |p{10.0cm}|p{1.2cm}|p{6.1cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_FRAME_FLAG_KEY_FRAME``
+      - 0x001
+      - The frame is a key frame.
+    * - ``V4L2_VP9_FRAME_FLAG_SHOW_FRAME``
+      - 0x002
+      - The frame should be displayed.
+    * - ``V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT``
+      - 0x004
+      - The decoding should be error resilient.
+    * - ``V4L2_VP9_FRAME_FLAG_INTRA_ONLY``
+      - 0x008
+      - The frame does not reference other frames.
+    * - ``V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV``
+      - 0x010
+      - The frame can use high precision motion vectors.
+    * - ``V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX``
+      - 0x020
+      - Frame context should be updated after decoding.
+    * - ``V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE``
+      - 0x040
+      - Parallel decoding is used.
+    * - ``V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING``
+      - 0x080
+      - Horizontal subsampling is enabled.
+    * - ``V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING``
+      - 0x100
+      - Vertical subsampling is enabled.
+    * - ``V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING``
+      - 0x200
+      - The full UV range is used.
+
+.. _vp9_ref_frame_sign_bias:
+
+``Reference Frame Sign Bias``
+
+.. tabularcolumns:: |p{7.0cm}|p{1.2cm}|p{9.1cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_SIGN_BIAS_LAST``
+      - 0x1
+      - Sign bias is set for the last reference frame.
+    * - ``V4L2_VP9_SIGN_BIAS_GOLDEN``
+      - 0x2
+      - Sign bias is set for the golden reference frame.
+    * - ``V4L2_VP9_SIGN_BIAS_ALT``
+      - 0x4
+      - Sign bias is set for the alt reference frame.
+
+.. _vp9_reset_frame_context:
+
+``Reset Frame Context``
+
+.. tabularcolumns:: |p{7.0cm}|p{1.2cm}|p{9.1cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_RESET_FRAME_CTX_NONE``
+      - 0
+      - Do not reset any frame context.
+    * - ``V4L2_VP9_RESET_FRAME_CTX_SPEC``
+      - 1
+      - Reset the frame context pointed to by
+        :c:type:`v4l2_ctrl_vp9_frame`.frame_context_idx.
+    * - ``V4L2_VP9_RESET_FRAME_CTX_ALL``
+      - 2
+      - Reset all frame contexts.
+
+See section '7.2 Uncompressed header semantics' of the :ref:`vp9` specification
+for more details.
+
+.. _vp9_interpolation_filter:
+
+``Interpolation Filter``
+
+.. tabularcolumns:: |p{9.0cm}|p{1.2cm}|p{7.1cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP``
+      - 0
+      - Eight tap filter.
+    * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH``
+      - 1
+      - Eight tap smooth filter.
+    * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP``
+      - 2
+      - Eight tap sharp filter.
+    * - ``V4L2_VP9_INTERP_FILTER_BILINEAR``
+      - 3
+      - Bilinear filter.
+    * - ``V4L2_VP9_INTERP_FILTER_SWITCHABLE``
+      - 4
+      - Filter selection is signaled at the block level.
+
+See section '7.2.7 Interpolation filter semantics' of the :ref:`vp9` specification
+for more details.
+
+.. _vp9_reference_mode:
+
+``Reference Mode``
+
+.. tabularcolumns:: |p{9.6cm}|p{0.5cm}|p{7.2cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE``
+      - 0
+      - Indicates that all the inter blocks use only a single reference frame
+        to generate motion compensated prediction.
+    * - ``V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE``
+      - 1
+      - Requires all the inter blocks to use compound mode. Single reference
+        frame prediction is not allowed.
+    * - ``V4L2_VP9_REFERENCE_MODE_SELECT``
+      - 2
+      - Allows each individual inter block to select between single and
+        compound prediction modes.
+
+See section '7.3.6 Frame reference mode semantics' of the :ref:`vp9` specification for more details.
+
+.. c:type:: v4l2_vp9_segmentation
+
+Encodes the segmentation parameters. See section '7.2.10 Segmentation
+params syntax' of the :ref:`vp9` specification for more details.
+
+.. tabularcolumns:: |p{0.8cm}|p{5cm}|p{11.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_vp9_segmentation
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __s16
+      - ``feature_data[8][4]``
+      - Data attached to each feature. Data entry is only valid if the feature
+        is enabled. The array shall be indexed with segment number as the first dimension
+        (0..7) and one of V4L2_VP9_SEG_* as the second dimension.
+        See :ref:`Segment Feature IDs<vp9_segment_feature>`.
+    * - __u8
+      - ``feature_enabled[8]``
+      - Bitmask defining which features are enabled in each segment. The value for each
+        segment is a combination of V4L2_VP9_SEGMENT_FEATURE_ENABLED(id) values where id is
+        one of V4L2_VP9_SEG_*. See :ref:`Segment Feature IDs<vp9_segment_feature>`.
+    * - __u8
+      - ``tree_probs[7]``
+      - Specifies the probability values to be used when decoding a Segment-ID.
+        See '5.15. Segmentation map' section of :ref:`vp9` for more details.
+    * - __u8
+      - ``pred_probs[3]``
+      - Specifies the probability values to be used when decoding a
+        Predicted-Segment-ID. See '6.4.14. Get segment id syntax'
+        section of :ref:`vp9` for more details.
+    * - __u8
+      - ``flags``
+      - Combination of V4L2_VP9_SEGMENTATION_FLAG_* flags. See
+        :ref:`Segmentation Flags<vp9_segmentation_flags>`.
+    * - __u8
+      - ``reserved[5]``
+      - Applications and drivers must set this to zero.
+
+.. _vp9_segment_feature:
+
+``Segment Feature IDs``
+
+.. tabularcolumns:: |p{6.0cm}|p{1cm}|p{10.3cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_SEG_LVL_ALT_Q``
+      - 0
+      - Quantizer segment feature.
+    * - ``V4L2_VP9_SEG_LVL_ALT_L``
+      - 1
+      - Loop filter segment feature.
+    * - ``V4L2_VP9_SEG_LVL_REF_FRAME``
+      - 2
+      - Reference frame segment feature.
+    * - ``V4L2_VP9_SEG_LVL_SKIP``
+      - 3
+      - Skip segment feature.
+    * - ``V4L2_VP9_SEG_LVL_MAX``
+      - 4
+      - Number of segment features.
+
+.. _vp9_segmentation_flags:
+
+``Segmentation Flags``
+
+.. tabularcolumns:: |p{10.6cm}|p{0.8cm}|p{5.9cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_SEGMENTATION_FLAG_ENABLED``
+      - 0x01
+      - Indicates that this frame makes use of the segmentation tool.
+    * - ``V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP``
+      - 0x02
+      - Indicates that the segmentation map should be updated during the
+        decoding of this frame.
+    * - ``V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE``
+      - 0x04
+      - Indicates that the updates to the segmentation map are coded
+        relative to the existing segmentation map.
+    * - ``V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA``
+      - 0x08
+      - Indicates that new parameters are about to be specified for each
+        segment.
+    * - ``V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE``
+      - 0x10
+      - Indicates that the segmentation parameters represent the actual values
+        to be used.
+
+.. c:type:: v4l2_vp9_quantization
+
+Encodes the quantization parameters. See section '7.2.9 Quantization params
+syntax' of the :ref:`vp9` specification for more details.
+
+.. tabularcolumns:: |p{0.8cm}|p{4cm}|p{12.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_vp9_quantization
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __u8
+      - ``base_q_idx``
+      - Indicates the base frame qindex.
+    * - __s8
+      - ``delta_q_y_dc``
+      - Indicates the Y DC quantizer relative to base_q_idx.
+    * - __s8
+      - ``delta_q_uv_dc``
+      - Indicates the UV DC quantizer relative to base_q_idx.
+    * - __s8
+      - ``delta_q_uv_ac``
+      - Indicates the UV AC quantizer relative to base_q_idx.
+    * - __u8
+      - ``reserved[4]``
+      - Applications and drivers must set this to zero.
+
+.. c:type:: v4l2_vp9_loop_filter
+
+This structure contains all loop filter related parameters. See section
+'7.2.8 Loop filter semantics' of the :ref:`vp9` specification for more details.
+
+.. tabularcolumns:: |p{0.8cm}|p{4cm}|p{12.4cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_vp9_loop_filter
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __s8
+      - ``ref_deltas[4]``
+      - Contains the adjustment needed for the filter level based on the chosen
+        reference frame.
+    * - __s8
+      - ``mode_deltas[2]``
+      - Contains the adjustment needed for the filter level based on the chosen
+        mode.
+    * - __u8
+      - ``level``
+      - Indicates the loop filter strength.
+    * - __u8
+      - ``sharpness``
+      - Indicates the sharpness level.
+    * - __u8
+      - ``flags``
+      - Combination of V4L2_VP9_LOOP_FILTER_FLAG_* flags.
+        See :ref:`Loop Filter Flags <vp9_loop_filter_flags>`.
+    * - __u8
+      - ``reserved[7]``
+      - Applications and drivers must set this to zero.
+
+
+.. _vp9_loop_filter_flags:
+
+``Loop Filter Flags``
+
+.. tabularcolumns:: |p{9.6cm}|p{0.5cm}|p{7.2cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED``
+      - 0x1
+      - When set, the filter level depends on the mode and reference frame used
+        to predict a block.
+    * - ``V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE``
+      - 0x2
+      - When set, the bitstream contains additional syntax elements that
+        specify which mode and reference frame deltas are to be updated.
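
The ``last_frame_ts``, ``golden_frame_ts`` and ``alt_frame_ts`` fields
documented above are nanosecond values derived from the ``timestamp`` of the
matching struct v4l2_buffer. As a rough illustration only, the conversion can
be sketched in plain userspace C as below. The function name
vp9_ref_ts_from_timeval() is hypothetical and merely stands in for the
v4l2_timeval_to_ns() helper named in the documentation; it is not part of
this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical stand-in for v4l2_timeval_to_ns(): fold the seconds and
 * microseconds of a buffer timestamp into one nanosecond count, which is
 * the form expected in the *_frame_ts reference fields.
 */
static uint64_t vp9_ref_ts_from_timeval(const struct timeval *tv)
{
	/* Buffer timestamps are not expected to be negative. */
	assert(tv->tv_sec >= 0 && tv->tv_usec >= 0);

	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}
```

An application would typically remember the value computed for each decoded
frame and pass it back later as a reference timestamp, so that the driver can
look up the corresponding capture buffer.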
diff --git a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
index 0ede39907ee2..967fc803ef94 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst
@@ -172,6 +172,21 @@ Compressed Formats
       - VP9 compressed video frame. The encoder generates one
 	compressed frame per buffer, and the decoder requires one
 	compressed frame per buffer.
+    * .. _V4L2-PIX-FMT-VP9-FRAME:
+
+      - ``V4L2_PIX_FMT_VP9_FRAME``
+      - 'VP9F'
+      - VP9 parsed frame, including the frame header, as extracted from the container.
+	This format is adapted for stateless video decoders that implement a
+	VP9 pipeline with the :ref:`stateless_decoder`.
+	Metadata associated with the frame to decode is required to be passed
+	through the ``V4L2_CID_STATELESS_VP9_FRAME`` and
+	the ``V4L2_CID_STATELESS_VP9_COMPRESSED_HDR`` controls.
+	See the :ref:`associated Codec Control IDs <v4l2-codec-stateless-vp9>`.
+	Exactly one output and one capture buffer must be provided for use with
+	this pixel format. The output buffer must contain the appropriate number
+	of macroblocks to decode a full corresponding frame to the matching
+	capture buffer.
     * .. _V4L2-PIX-FMT-HEVC:
 
       - ``V4L2_PIX_FMT_HEVC``
diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
index 2d6bc8d94380..d2bdd3db076f 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
@@ -233,6 +233,14 @@ still cause this situation.
       - ``p_mpeg2_quantisation``
       - A pointer to a struct :c:type:`v4l2_ctrl_mpeg2_quantisation`. Valid if this control is
         of type ``V4L2_CTRL_TYPE_MPEG2_QUANTISATION``.
+    * - struct :c:type:`v4l2_ctrl_vp9_compressed_hdr` *
+      - ``p_vp9_compressed_hdr_probs``
+      - A pointer to a struct :c:type:`v4l2_ctrl_vp9_compressed_hdr`. Valid if this
+        control is of type ``V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR``.
+    * - struct :c:type:`v4l2_ctrl_vp9_frame` *
+      - ``p_vp9_frame``
+      - A pointer to a struct :c:type:`v4l2_ctrl_vp9_frame`. Valid if this
+        control is of type ``V4L2_CTRL_TYPE_VP9_FRAME``.
     * - struct :c:type:`v4l2_ctrl_hdr10_cll_info` *
       - ``p_hdr10_cll``
       - A pointer to a struct :c:type:`v4l2_ctrl_hdr10_cll_info`. Valid if this control is
diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
index f9ecf6276129..9ad930823960 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
@@ -507,6 +507,18 @@ See also the examples in :ref:`control`.
       - n/a
       - A struct :c:type:`v4l2_ctrl_hevc_decode_params`, containing HEVC
 	decoding parameters for stateless video decoders.
+    * - ``V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR``
+      - n/a
+      - n/a
+      - n/a
+      - A struct :c:type:`v4l2_ctrl_vp9_compressed_hdr`, containing VP9
+	probabilities updates for stateless video decoders.
+    * - ``V4L2_CTRL_TYPE_VP9_FRAME``
+      - n/a
+      - n/a
+      - n/a
+      - A struct :c:type:`v4l2_ctrl_vp9_frame`, containing VP9
+	frame decode parameters for stateless video decoders.
 
 .. raw:: latex
 
diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
index eb0b1cd37abd..9cbb7a0c354a 100644
--- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
+++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
@@ -149,6 +149,8 @@ replace symbol V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS :c:type:`v4l2_ctrl_type`
 replace symbol V4L2_CTRL_TYPE_AREA :c:type:`v4l2_ctrl_type`
 replace symbol V4L2_CTRL_TYPE_FWHT_PARAMS :c:type:`v4l2_ctrl_type`
 replace symbol V4L2_CTRL_TYPE_VP8_FRAME :c:type:`v4l2_ctrl_type`
+replace symbol V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR :c:type:`v4l2_ctrl_type`
+replace symbol V4L2_CTRL_TYPE_VP9_FRAME :c:type:`v4l2_ctrl_type`
 replace symbol V4L2_CTRL_TYPE_HDR10_CLL_INFO :c:type:`v4l2_ctrl_type`
 replace symbol V4L2_CTRL_TYPE_HDR10_MASTERING_DISPLAY :c:type:`v4l2_ctrl_type`
 
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
index c4b5082849b6..52b9ff46ab26 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
@@ -283,6 +283,12 @@ static void std_log(const struct v4l2_ctrl *ctrl)
 	case V4L2_CTRL_TYPE_MPEG2_PICTURE:
 		pr_cont("MPEG2_PICTURE");
 		break;
+	case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR:
+		pr_cont("VP9_COMPRESSED_HDR");
+		break;
+	case V4L2_CTRL_TYPE_VP9_FRAME:
+		pr_cont("VP9_FRAME");
+		break;
 	default:
 		pr_cont("unknown type %d", ctrl->type);
 		break;
@@ -317,6 +323,168 @@ static void std_log(const struct v4l2_ctrl *ctrl)
 #define zero_reserved(s) \
 	memset(&(s).reserved, 0, sizeof((s).reserved))
 
+static int
+validate_vp9_lf_params(struct v4l2_vp9_loop_filter *lf)
+{
+	unsigned int i;
+
+	if (lf->flags & ~(V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED |
+			  V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE))
+		return -EINVAL;
+
+	/* Check that all values are in the accepted range. */
+	if (lf->level > GENMASK(5, 0))
+		return -EINVAL;
+
+	if (lf->sharpness > GENMASK(2, 0))
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++)
+		if (lf->ref_deltas[i] < -63 || lf->ref_deltas[i] > 63)
+			return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++)
+		if (lf->mode_deltas[i] < -63 || lf->mode_deltas[i] > 63)
+			return -EINVAL;
+
+	zero_reserved(*lf);
+	return 0;
+}
+
+static int
+validate_vp9_quant_params(struct v4l2_vp9_quantization *quant)
+{
+	if (quant->delta_q_y_dc < -15 || quant->delta_q_y_dc > 15 ||
+	    quant->delta_q_uv_dc < -15 || quant->delta_q_uv_dc > 15 ||
+	    quant->delta_q_uv_ac < -15 || quant->delta_q_uv_ac > 15)
+		return -EINVAL;
+
+	zero_reserved(*quant);
+	return 0;
+}
+
+static int
+validate_vp9_seg_params(struct v4l2_vp9_segmentation *seg)
+{
+	unsigned int i, j;
+
+	if (seg->flags & ~(V4L2_VP9_SEGMENTATION_FLAG_ENABLED |
+			   V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP |
+			   V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE |
+			   V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA |
+			   V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE))
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(seg->feature_enabled); i++) {
+		if (seg->feature_enabled[i] &
+		    ~V4L2_VP9_SEGMENT_FEATURE_ENABLED_MASK)
+			return -EINVAL;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(seg->feature_data); i++) {
+		const int range[] = { 255, 63, 3, 0 };
+
+		for (j = 0; j < ARRAY_SIZE(seg->feature_data[i]); j++) {
+			if (seg->feature_data[i][j] < -range[j] ||
+			    seg->feature_data[i][j] > range[j])
+				return -EINVAL;
+		}
+	}
+
+	zero_reserved(*seg);
+	return 0;
+}
+
+static int
+validate_vp9_compressed_hdr(struct v4l2_ctrl_vp9_compressed_hdr *hdr)
+{
+	if (hdr->tx_mode > V4L2_VP9_TX_MODE_SELECT)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int
+validate_vp9_frame(struct v4l2_ctrl_vp9_frame *frame)
+{
+	int ret;
+
+	/* Make sure we're not passed invalid flags. */
+	if (frame->flags & ~(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
+		  V4L2_VP9_FRAME_FLAG_SHOW_FRAME |
+		  V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT |
+		  V4L2_VP9_FRAME_FLAG_INTRA_ONLY |
+		  V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV |
+		  V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX |
+		  V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE |
+		  V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING |
+		  V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING |
+		  V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING))
+		return -EINVAL;
+
+	if (frame->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT &&
+	    frame->flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX)
+		return -EINVAL;
+
+	if (frame->profile > V4L2_VP9_PROFILE_MAX)
+		return -EINVAL;
+
+	if (frame->reset_frame_context > V4L2_VP9_RESET_FRAME_CTX_ALL)
+		return -EINVAL;
+
+	if (frame->frame_context_idx >= V4L2_VP9_NUM_FRAME_CTX)
+		return -EINVAL;
+
+	/*
+	 * Profiles 0 and 1 only support 8-bit depth, profiles 2 and 3 only 10
+	 * and 12 bit depths.
+	 */
+	if ((frame->profile < 2 && frame->bit_depth != 8) ||
+	    (frame->profile >= 2 &&
+	     (frame->bit_depth != 10 && frame->bit_depth != 12)))
+		return -EINVAL;
+
+	/* Profile 0 and 2 only accept YUV 4:2:0. */
+	if ((frame->profile == 0 || frame->profile == 2) &&
+	    (!(frame->flags & V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING) ||
+	     !(frame->flags & V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING)))
+		return -EINVAL;
+
+	/* Profile 1 and 3 only accept YUV 4:2:2, 4:4:0 and 4:4:4. */
+	if ((frame->profile == 1 || frame->profile == 3) &&
+	    ((frame->flags & V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING) &&
+	     (frame->flags & V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING)))
+		return -EINVAL;
+
+	if (frame->interpolation_filter > V4L2_VP9_INTERP_FILTER_SWITCHABLE)
+		return -EINVAL;
+
+	/*
+	 * According to the spec, tile_cols_log2 shall be less than or equal
+	 * to 6.
+	 */
+	if (frame->tile_cols_log2 > 6)
+		return -EINVAL;
+
+	if (frame->reference_mode > V4L2_VP9_REFERENCE_MODE_SELECT)
+		return -EINVAL;
+
+	ret = validate_vp9_lf_params(&frame->lf);
+	if (ret)
+		return ret;
+
+	ret = validate_vp9_quant_params(&frame->quant);
+	if (ret)
+		return ret;
+
+	ret = validate_vp9_seg_params(&frame->seg);
+	if (ret)
+		return ret;
+
+	zero_reserved(*frame);
+	return 0;
+}
+
 /*
  * Compound controls validation requires setting unused fields/flags to zero
  * in order to properly detect unchanged controls with std_equal's memcmp.
@@ -687,6 +855,12 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
 
 		break;
 
+	case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR:
+		return validate_vp9_compressed_hdr(p);
+
+	case V4L2_CTRL_TYPE_VP9_FRAME:
+		return validate_vp9_frame(p);
+
 	case V4L2_CTRL_TYPE_AREA:
 		area = p;
 		if (!area->width || !area->height)
@@ -1249,6 +1423,12 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	case V4L2_CTRL_TYPE_HDR10_MASTERING_DISPLAY:
 		elem_size = sizeof(struct v4l2_ctrl_hdr10_mastering_display);
 		break;
+	case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR:
+		elem_size = sizeof(struct v4l2_ctrl_vp9_compressed_hdr);
+		break;
+	case V4L2_CTRL_TYPE_VP9_FRAME:
+		elem_size = sizeof(struct v4l2_ctrl_vp9_frame);
+		break;
 	case V4L2_CTRL_TYPE_AREA:
 		elem_size = sizeof(struct v4l2_area);
 		break;
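
The profile, bit depth and chroma subsampling checks in validate_vp9_frame()
above may be easier to follow in isolation. The following is a minimal
standalone sketch of those three rules, not the kernel code itself: the
SKETCH_* macros and function names are made up for illustration, and the two
flag values are copied from the Frame Flags table in the documentation part
of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of two values from the Frame Flags table (assumed names). */
#define SKETCH_VP9_X_SUBSAMPLING 0x080u
#define SKETCH_VP9_Y_SUBSAMPLING 0x100u

/* Returns true when profile, bit depth and chroma format are compatible. */
static bool sketch_vp9_profile_ok(uint8_t profile, uint8_t bit_depth,
				  uint32_t flags)
{
	bool is_420 = (flags & SKETCH_VP9_X_SUBSAMPLING) &&
		      (flags & SKETCH_VP9_Y_SUBSAMPLING);

	if (profile > 3)
		return false;

	/* Profiles 0 and 1 are 8-bit only; profiles 2 and 3 are 10 or 12 bit. */
	if (profile < 2 ? bit_depth != 8
			: (bit_depth != 10 && bit_depth != 12))
		return false;

	/* Even profiles require 4:2:0; odd profiles exclude it. */
	return (profile % 2 == 0) ? is_420 : !is_420;
}

/* Usage example: a few combinations and the result the rules imply. */
static void sketch_vp9_profile_examples(void)
{
	uint32_t yuv420 = SKETCH_VP9_X_SUBSAMPLING | SKETCH_VP9_Y_SUBSAMPLING;

	assert(sketch_vp9_profile_ok(0, 8, yuv420));
	assert(!sketch_vp9_profile_ok(0, 10, yuv420));
	assert(sketch_vp9_profile_ok(1, 8, SKETCH_VP9_X_SUBSAMPLING));
	assert(!sketch_vp9_profile_ok(3, 12, yuv420));
}
```

Folding the two subsampling flags into a single "is 4:2:0" predicate is a
presentation choice made for this sketch; the patch spells out each
condition separately.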
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-defs.c b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
index 421300e13a41..5845c1b6bb2a 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-defs.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
@@ -1175,6 +1175,8 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_STATELESS_MPEG2_SEQUENCE:			return "MPEG-2 Sequence Header";
 	case V4L2_CID_STATELESS_MPEG2_PICTURE:			return "MPEG-2 Picture Header";
 	case V4L2_CID_STATELESS_MPEG2_QUANTISATION:		return "MPEG-2 Quantisation Matrices";
+	case V4L2_CID_STATELESS_VP9_COMPRESSED_HDR:	return "VP9 Probabilities Updates";
+	case V4L2_CID_STATELESS_VP9_FRAME:			return "VP9 Frame Decode Parameters";
 
 	/* Colorimetry controls */
 	/* Keep the order of the 'case's the same as in v4l2-controls.h! */
@@ -1493,6 +1495,12 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS:
 		*type = V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS;
 		break;
+	case V4L2_CID_STATELESS_VP9_COMPRESSED_HDR:
+		*type = V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR;
+		break;
+	case V4L2_CID_STATELESS_VP9_FRAME:
+		*type = V4L2_CTRL_TYPE_VP9_FRAME;
+		break;
 	case V4L2_CID_UNIT_CELL_SIZE:
 		*type = V4L2_CTRL_TYPE_AREA;
 		*flags |= V4L2_CTRL_FLAG_READ_ONLY;
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index ec6fc1ef291e..7a5e8120d733 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1394,6 +1394,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
 		case V4L2_PIX_FMT_VP8:		descr = "VP8"; break;
 		case V4L2_PIX_FMT_VP8_FRAME:    descr = "VP8 Frame"; break;
 		case V4L2_PIX_FMT_VP9:		descr = "VP9"; break;
+		case V4L2_PIX_FMT_VP9_FRAME:    descr = "VP9 Frame"; break;
 		case V4L2_PIX_FMT_HEVC:		descr = "HEVC"; break; /* aka H.265 */
 		case V4L2_PIX_FMT_HEVC_SLICE:	descr = "HEVC Parsed Slice Data"; break;
 		case V4L2_PIX_FMT_FWHT:		descr = "FWHT"; break; /* used in vicodec */
diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h
index 575b59fbac77..b3ce438f1329 100644
--- a/include/media/v4l2-ctrls.h
+++ b/include/media/v4l2-ctrls.h
@@ -50,6 +50,8 @@ struct video_device;
  * @p_h264_decode_params:	Pointer to a struct v4l2_ctrl_h264_decode_params.
  * @p_h264_pred_weights:	Pointer to a struct v4l2_ctrl_h264_pred_weights.
  * @p_vp8_frame:		Pointer to a VP8 frame params structure.
+ * @p_vp9_compressed_hdr_probs:	Pointer to a VP9 frame compressed header probs structure.
+ * @p_vp9_frame:		Pointer to a VP9 frame params structure.
  * @p_hevc_sps:			Pointer to an HEVC sequence parameter set structure.
  * @p_hevc_pps:			Pointer to an HEVC picture parameter set structure.
  * @p_hevc_slice_params:	Pointer to an HEVC slice parameters structure.
@@ -80,6 +82,8 @@ union v4l2_ctrl_ptr {
 	struct v4l2_ctrl_hevc_sps *p_hevc_sps;
 	struct v4l2_ctrl_hevc_pps *p_hevc_pps;
 	struct v4l2_ctrl_hevc_slice_params *p_hevc_slice_params;
+	struct v4l2_ctrl_vp9_compressed_hdr *p_vp9_compressed_hdr_probs;
+	struct v4l2_ctrl_vp9_frame *p_vp9_frame;
 	struct v4l2_ctrl_hdr10_cll_info *p_hdr10_cll;
 	struct v4l2_ctrl_hdr10_mastering_display *p_hdr10_mastering;
 	struct v4l2_area *p_area;
diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
index 5532b5f68493..36c82ad98030 100644
--- a/include/uapi/linux/v4l2-controls.h
+++ b/include/uapi/linux/v4l2-controls.h
@@ -2010,6 +2010,290 @@ struct v4l2_ctrl_hdr10_mastering_display {
 	__u32 min_display_mastering_luminance;
 };
 
+/* Stateless VP9 controls */
+
+#define V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED	0x1
+#define V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE	0x2
+
+/**
+ * struct v4l2_vp9_loop_filter - VP9 loop filter parameters
+ *
+ * @ref_deltas: contains the adjustment needed for the filter level based on the
+ * chosen reference frame. If this syntax element is not present in the bitstream,
+ * users should pass its last value.
+ * @mode_deltas: contains the adjustment needed for the filter level based on the
+ * chosen mode. If this syntax element is not present in the bitstream, users should
+ * pass its last value.
+ * @level: indicates the loop filter strength.
+ * @sharpness: indicates the sharpness level.
+ * @flags: combination of V4L2_VP9_LOOP_FILTER_FLAG_{} flags.
+ * @reserved: padding field. Should be zeroed by applications.
+ *
+ * This structure contains all loop filter related parameters. See section
+ * '7.2.8 Loop filter semantics' of the VP9 specification for more details.
+ */
+struct v4l2_vp9_loop_filter {
+	__s8 ref_deltas[4];
+	__s8 mode_deltas[2];
+	__u8 level;
+	__u8 sharpness;
+	__u8 flags;
+	__u8 reserved[7];
+};
+
+/**
+ * struct v4l2_vp9_quantization - VP9 quantization parameters
+ *
+ * @base_q_idx: indicates the base frame qindex.
+ * @delta_q_y_dc: indicates the Y DC quantizer relative to base_q_idx.
+ * @delta_q_uv_dc: indicates the UV DC quantizer relative to base_q_idx.
+ * @delta_q_uv_ac: indicates the UV AC quantizer relative to base_q_idx.
+ * @reserved: padding field. Should be zeroed by applications.
+ *
+ * Encodes the quantization parameters. See section '7.2.9 Quantization params
+ * syntax' of the VP9 specification for more details.
+ */
+struct v4l2_vp9_quantization {
+	__u8 base_q_idx;
+	__s8 delta_q_y_dc;
+	__s8 delta_q_uv_dc;
+	__s8 delta_q_uv_ac;
+	__u8 reserved[4];
+};
+
+#define V4L2_VP9_SEGMENTATION_FLAG_ENABLED		0x01
+#define V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP		0x02
+#define V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE	0x04
+#define V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA		0x08
+#define V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE	0x10
+
+#define V4L2_VP9_SEG_LVL_ALT_Q				0
+#define V4L2_VP9_SEG_LVL_ALT_L				1
+#define V4L2_VP9_SEG_LVL_REF_FRAME			2
+#define V4L2_VP9_SEG_LVL_SKIP				3
+#define V4L2_VP9_SEG_LVL_MAX				4
+
+#define V4L2_VP9_SEGMENT_FEATURE_ENABLED(id)	(1 << (id))
+#define V4L2_VP9_SEGMENT_FEATURE_ENABLED_MASK	0xf
+
+/**
+ * struct v4l2_vp9_segmentation - VP9 segmentation parameters
+ *
+ * @feature_data: data attached to each feature. Data entry is only valid if
+ * the feature is enabled. The array shall be indexed with segment number as
+ * the first dimension (0..7) and one of V4L2_VP9_SEG_LVL_{} as the second dimension.
+ * @feature_enabled: bitmask defining which features are enabled in each segment.
+ * The value for each segment is a combination of V4L2_VP9_SEGMENT_FEATURE_ENABLED(id)
+ * values where id is one of V4L2_VP9_SEG_LVL_{}.
+ * @tree_probs: specifies the probability values to be used when decoding a
+ * Segment-ID. See '5.15. Segmentation map' section of the VP9 specification
+ * for more details.
+ * @pred_probs: specifies the probability values to be used when decoding a
+ * Predicted-Segment-ID. See '6.4.14. Get segment id syntax' section of the VP9
+ * specification for more details.
+ * @flags: combination of V4L2_VP9_SEGMENTATION_FLAG_{} flags.
+ * @reserved: padding field. Should be zeroed by applications.
+ *
+ * Encodes the segmentation parameters. See section '7.2.10 Segmentation params syntax' of
+ * the VP9 specification for more details.
+ */
+struct v4l2_vp9_segmentation {
+	__s16 feature_data[8][4];
+	__u8 feature_enabled[8];
+	__u8 tree_probs[7];
+	__u8 pred_probs[3];
+	__u8 flags;
+	__u8 reserved[5];
+};
+
+#define V4L2_VP9_FRAME_FLAG_KEY_FRAME			0x001
+#define V4L2_VP9_FRAME_FLAG_SHOW_FRAME			0x002
+#define V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT		0x004
+#define V4L2_VP9_FRAME_FLAG_INTRA_ONLY			0x008
+#define V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV		0x010
+#define V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX		0x020
+#define V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE		0x040
+#define V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING		0x080
+#define V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING		0x100
+#define V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING	0x200
+
+#define V4L2_VP9_SIGN_BIAS_LAST				0x1
+#define V4L2_VP9_SIGN_BIAS_GOLDEN			0x2
+#define V4L2_VP9_SIGN_BIAS_ALT				0x4
+
+#define V4L2_VP9_RESET_FRAME_CTX_NONE			0
+#define V4L2_VP9_RESET_FRAME_CTX_SPEC			1
+#define V4L2_VP9_RESET_FRAME_CTX_ALL			2
+
+#define V4L2_VP9_INTERP_FILTER_EIGHTTAP			0
+#define V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH		1
+#define V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP		2
+#define V4L2_VP9_INTERP_FILTER_BILINEAR			3
+#define V4L2_VP9_INTERP_FILTER_SWITCHABLE		4
+
+#define V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE	0
+#define V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE	1
+#define V4L2_VP9_REFERENCE_MODE_SELECT			2
+
+#define V4L2_VP9_PROFILE_MAX				3
+
+#define V4L2_CID_STATELESS_VP9_FRAME	(V4L2_CID_CODEC_STATELESS_BASE + 300)
+/**
+ * struct v4l2_ctrl_vp9_frame - VP9 frame decoding control
+ *
+ * @lf: loop filter parameters. See &v4l2_vp9_loop_filter for more details.
+ * @quant: quantization parameters. See &v4l2_vp9_quantization for more details.
+ * @seg: segmentation parameters. See &v4l2_vp9_segmentation for more details.
+ * @flags: combination of V4L2_VP9_FRAME_FLAG_{} flags.
+ * @compressed_header_size: compressed header size in bytes.
+ * @uncompressed_header_size: uncompressed header size in bytes.
+ * @frame_width_minus_1: add 1 to it and you'll get the frame width expressed in pixels.
+ * @frame_height_minus_1: add 1 to it and you'll get the frame height expressed in pixels.
+ * @render_width_minus_1: add 1 to it and you'll get the expected render width expressed in
+ * pixels. This is not used during the decoding process but might be used by HW scalers
+ * to prepare a frame that's ready for scanout.
+ * @render_height_minus_1: add 1 to it and you'll get the expected render height expressed in
+ * pixels. This is not used during the decoding process but might be used by HW scalers
+ * to prepare a frame that's ready for scanout.
+ * @last_frame_ts: "last" reference buffer timestamp.
+ * The timestamp refers to the timestamp field in struct v4l2_buffer.
+ * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64.
+ * @golden_frame_ts: "golden" reference buffer timestamp.
+ * The timestamp refers to the timestamp field in struct v4l2_buffer.
+ * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64.
+ * @alt_frame_ts: "alt" reference buffer timestamp.
+ * The timestamp refers to the timestamp field in struct v4l2_buffer.
+ * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64.
+ * @ref_frame_sign_bias: a bitfield specifying whether the sign bias is set for a given
+ * reference frame. A combination of V4L2_VP9_SIGN_BIAS_{} flags.
+ * @reset_frame_context: specifies whether the frame context should be reset to default values.
+ * Set to one of V4L2_VP9_RESET_FRAME_CTX_{}.
+ * @frame_context_idx: frame context that should be used/updated.
+ * @profile: VP9 profile. Can be 0, 1, 2 or 3.
+ * @bit_depth: bits per component. Can be 8, 10 or 12. Note that not all profiles support
+ * 10- and/or 12-bit depths.
+ * @interpolation_filter: specifies the filter selection used for performing inter prediction.
+ * Set to one of V4L2_VP9_INTERP_FILTER_{}.
+ * @tile_cols_log2: specifies the base 2 logarithm of the width of each tile (where the width
+ * is measured in units of 8x8 blocks). Shall be less than or equal to 6.
+ * @tile_rows_log2: specifies the base 2 logarithm of the height of each tile (where the height
+ * is measured in units of 8x8 blocks).
+ * @reference_mode: specifies the type of inter prediction to be used.
+ * Set to one of V4L2_VP9_REFERENCE_MODE_{}.
+ * @reserved: padding field. Should be zeroed by applications.
+ */
+struct v4l2_ctrl_vp9_frame {
+	struct v4l2_vp9_loop_filter lf;
+	struct v4l2_vp9_quantization quant;
+	struct v4l2_vp9_segmentation seg;
+	__u32 flags;
+	__u16 compressed_header_size;
+	__u16 uncompressed_header_size;
+	__u16 frame_width_minus_1;
+	__u16 frame_height_minus_1;
+	__u16 render_width_minus_1;
+	__u16 render_height_minus_1;
+	__u64 last_frame_ts;
+	__u64 golden_frame_ts;
+	__u64 alt_frame_ts;
+	__u8 ref_frame_sign_bias;
+	__u8 reset_frame_context;
+	__u8 frame_context_idx;
+	__u8 profile;
+	__u8 bit_depth;
+	__u8 interpolation_filter;
+	__u8 tile_cols_log2;
+	__u8 tile_rows_log2;
+	__u8 reference_mode;
+	__u8 reserved[7];
+};
+
+#define V4L2_VP9_NUM_FRAME_CTX	4
+
+/**
+ * struct v4l2_vp9_mv_probs - VP9 Motion vector probability updates
+ * @joint: motion vector joint probability updates.
+ * @sign: motion vector sign probability updates.
+ * @classes: motion vector class probability updates.
+ * @class0_bit: motion vector class0 bit probability updates.
+ * @bits: motion vector bits probability updates.
+ * @class0_fr: motion vector class0 fractional bit probability updates.
+ * @fr: motion vector fractional bit probability updates.
+ * @class0_hp: motion vector class0 high precision fractional bit probability updates.
+ * @hp: motion vector high precision fractional bit probability updates.
+ *
+ * This structure contains new values of motion vector probabilities.
+ * A value of zero in an array element means there is no update of the relevant probability.
+ * See &v4l2_ctrl_vp9_compressed_hdr for details.
+ */
+struct v4l2_vp9_mv_probs {
+	__u8 joint[3];
+	__u8 sign[2];
+	__u8 classes[2][10];
+	__u8 class0_bit[2];
+	__u8 bits[2][10];
+	__u8 class0_fr[2][2][3];
+	__u8 fr[2][3];
+	__u8 class0_hp[2];
+	__u8 hp[2];
+};
+
+#define V4L2_CID_STATELESS_VP9_COMPRESSED_HDR	(V4L2_CID_CODEC_STATELESS_BASE + 301)
+
+#define V4L2_VP9_TX_MODE_ONLY_4X4			0
+#define V4L2_VP9_TX_MODE_ALLOW_8X8			1
+#define V4L2_VP9_TX_MODE_ALLOW_16X16			2
+#define V4L2_VP9_TX_MODE_ALLOW_32X32			3
+#define V4L2_VP9_TX_MODE_SELECT				4
+
+/**
+ * struct v4l2_ctrl_vp9_compressed_hdr - VP9 probability updates control
+ * @tx_mode: specifies the TX mode. Set to one of V4L2_VP9_TX_MODE_{}.
+ * @tx8: TX 8x8 probability updates.
+ * @tx16: TX 16x16 probability updates.
+ * @tx32: TX 32x32 probability updates.
+ * @coef: coefficient probability updates.
+ * @skip: skip probability updates.
+ * @inter_mode: inter mode probability updates.
+ * @interp_filter: interpolation filter probability updates.
+ * @is_inter: is inter-block probability updates.
+ * @comp_mode: compound prediction mode probability updates.
+ * @single_ref: single ref probability updates.
+ * @comp_ref: compound ref probability updates.
+ * @y_mode: Y prediction mode probability updates.
+ * @uv_mode: UV prediction mode probability updates.
+ * @partition: partition probability updates.
+ * @mv: motion vector probability updates.
+ *
+ * This structure holds the probability updates as parsed from the compressed
+ * header (Spec 6.3). Each value is the probability update after it has been
+ * translated with inv_map_table[] (see 6.3.5). A value of zero in an array element
+ * means that there is no update of the relevant probability.
+ *
+ * This control is optional and is needed only for hardware which is
+ * not capable of parsing the compressed header itself. Only drivers which need it will
+ * implement it.
+ */
+struct v4l2_ctrl_vp9_compressed_hdr {
+	__u8 tx_mode;
+	__u8 tx8[2][1];
+	__u8 tx16[2][2];
+	__u8 tx32[2][3];
+	__u8 coef[4][2][2][6][6][3];
+	__u8 skip[3];
+	__u8 inter_mode[7][3];
+	__u8 interp_filter[4][2];
+	__u8 is_inter[4];
+	__u8 comp_mode[5];
+	__u8 single_ref[5][2];
+	__u8 comp_ref[5];
+	__u8 y_mode[4][9];
+	__u8 uv_mode[10][9];
+	__u8 partition[16][3];
+
+	struct v4l2_vp9_mv_probs mv;
+};
+
 /* MPEG-compression definitions kept for backwards compatibility */
 #ifndef __KERNEL__
 #define V4L2_CTRL_CLASS_MPEG            V4L2_CTRL_CLASS_CODEC
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 58392dcd3bf5..2cd8f7e432c5 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -703,6 +703,7 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_VP8      v4l2_fourcc('V', 'P', '8', '0') /* VP8 */
 #define V4L2_PIX_FMT_VP8_FRAME v4l2_fourcc('V', 'P', '8', 'F') /* VP8 parsed frame */
 #define V4L2_PIX_FMT_VP9      v4l2_fourcc('V', 'P', '9', '0') /* VP9 */
+#define V4L2_PIX_FMT_VP9_FRAME v4l2_fourcc('V', 'P', '9', 'F') /* VP9 parsed frame */
 #define V4L2_PIX_FMT_HEVC     v4l2_fourcc('H', 'E', 'V', 'C') /* HEVC aka H.265 */
 #define V4L2_PIX_FMT_FWHT     v4l2_fourcc('F', 'W', 'H', 'T') /* Fast Walsh Hadamard Transform (vicodec) */
 #define V4L2_PIX_FMT_FWHT_STATELESS     v4l2_fourcc('S', 'F', 'W', 'H') /* Stateless FWHT (vicodec) */
@@ -1755,6 +1756,8 @@ struct v4l2_ext_control {
 		struct v4l2_ctrl_mpeg2_sequence __user *p_mpeg2_sequence;
 		struct v4l2_ctrl_mpeg2_picture __user *p_mpeg2_picture;
 		struct v4l2_ctrl_mpeg2_quantisation __user *p_mpeg2_quantisation;
+		struct v4l2_ctrl_vp9_compressed_hdr __user *p_vp9_compressed_hdr_probs;
+		struct v4l2_ctrl_vp9_frame __user *p_vp9_frame;
 		void __user *ptr;
 	};
 } __attribute__ ((packed));
@@ -1819,6 +1822,9 @@ enum v4l2_ctrl_type {
 	V4L2_CTRL_TYPE_MPEG2_QUANTISATION   = 0x0250,
 	V4L2_CTRL_TYPE_MPEG2_SEQUENCE       = 0x0251,
 	V4L2_CTRL_TYPE_MPEG2_PICTURE        = 0x0252,
+
+	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR	= 0x0260,
+	V4L2_CTRL_TYPE_VP9_FRAME		= 0x0261,
 };
 
 /*  Used in the VIDIOC_QUERYCTRL ioctl for querying controls */
-- 
2.17.1
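
A minimal userspace-side sketch of two conventions documented in the
kernel-doc above: the per-segment feature_enabled bitmask and the *_minus_1
size fields. The macro values are copied from this patch. The struct
example_vp9_segment and both example_* helpers are hypothetical names used
only for illustration; they are not part of the uAPI or of any existing
library.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the uAPI additions in this patch. */
#define V4L2_VP9_SEG_LVL_ALT_Q			0
#define V4L2_VP9_SEG_LVL_ALT_L			1
#define V4L2_VP9_SEG_LVL_REF_FRAME		2
#define V4L2_VP9_SEG_LVL_SKIP			3
#define V4L2_VP9_SEG_LVL_MAX			4
#define V4L2_VP9_SEGMENT_FEATURE_ENABLED(id)	(1 << (id))

/* Hypothetical parser-side view of one segment; not part of the uAPI. */
struct example_vp9_segment {
	int enabled[V4L2_VP9_SEG_LVL_MAX];
};

/* Hypothetical helper: build one feature_enabled[] entry for a segment. */
static uint8_t example_vp9_feature_mask(const struct example_vp9_segment *seg)
{
	uint8_t mask = 0;
	int id;

	for (id = 0; id < V4L2_VP9_SEG_LVL_MAX; id++)
		if (seg->enabled[id])
			mask |= V4L2_VP9_SEGMENT_FEATURE_ENABLED(id);

	return mask;
}

/* Hypothetical helper: convert a size in pixels to the *_minus_1 encoding. */
static uint16_t example_vp9_size_minus_1(unsigned int pixels)
{
	/* The __u16 *_minus_1 fields can represent sizes 1..65536. */
	assert(pixels >= 1 && pixels <= 65536);
	return (uint16_t)(pixels - 1);
}
```

With ALT_Q and SKIP enabled for a segment, the mask comes out as 0x9, and a
1920 pixel wide frame is passed as frame_width_minus_1 = 1919.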



* [PATCH v7 06/11] media: Add VP9 v4l2 library
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (4 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 05/11] media: uapi: Add VP9 stateless decoder controls Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 07/11] media: rkvdec: Add the VP9 backend Andrzej Pietrasiewicz
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

Provide code common to VP9 drivers in one central location.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
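
Note on reading v4l2_vp9_kf_partition_probs[16][3] in this patch: going by
the row comments, the table is laid out as four groups of four rows, one
group per block size step (8x8 -> 4x4 first, 64x64 -> 32x32 last), and
within a group the "above split" state selects the low bit and the "left
split" state the next bit. The sketch below only restates that layout;
example_kf_partition_row() is a hypothetical helper, not something this
patch adds, and whether a given driver indexes the table exactly this way is
an assumption.

```c
#include <assert.h>

/*
 * Hypothetical helper: row of v4l2_vp9_kf_partition_probs for a given
 * block size step and neighbour state, as inferred from the row comments.
 *
 * level: 0 = 8x8 -> 4x4, 1 = 16x16 -> 8x8, 2 = 32x32 -> 16x16,
 *        3 = 64x64 -> 32x32
 */
static unsigned int example_kf_partition_row(unsigned int level,
					     int above_split, int left_split)
{
	assert(level < 4);
	return level * 4 + (left_split ? 2 : 0) + (above_split ? 1 : 0);
}
```

For instance, level 2 with only the above block split gives row 9, which is
the "a split, l not split" entry in the 32x32 -> 16x16 group.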
 drivers/media/v4l2-core/Kconfig    |    4 +
 drivers/media/v4l2-core/Makefile   |    1 +
 drivers/media/v4l2-core/v4l2-vp9.c | 1850 ++++++++++++++++++++++++++++
 include/media/v4l2-vp9.h           |  182 +++
 4 files changed, 2037 insertions(+)
 create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
 create mode 100644 include/media/v4l2-vp9.h

diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
index 02dc1787e953..6ee75c6c820e 100644
--- a/drivers/media/v4l2-core/Kconfig
+++ b/drivers/media/v4l2-core/Kconfig
@@ -52,6 +52,10 @@ config V4L2_JPEG_HELPER
 config V4L2_H264
 	tristate
 
+# Used by drivers that need v4l2-vp9.ko
+config V4L2_VP9
+	tristate
+
 # Used by drivers that need v4l2-mem2mem.ko
 config V4L2_MEM2MEM_DEV
 	tristate
diff --git a/drivers/media/v4l2-core/Makefile b/drivers/media/v4l2-core/Makefile
index 66a78c556c98..83fac5c746f5 100644
--- a/drivers/media/v4l2-core/Makefile
+++ b/drivers/media/v4l2-core/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_VIDEO_TUNER) += tuner.o
 
 obj-$(CONFIG_V4L2_MEM2MEM_DEV) += v4l2-mem2mem.o
 obj-$(CONFIG_V4L2_H264) += v4l2-h264.o
+obj-$(CONFIG_V4L2_VP9) += v4l2-vp9.o
 
 obj-$(CONFIG_V4L2_FLASH_LED_CLASS) += v4l2-flash-led-class.o
 
diff --git a/drivers/media/v4l2-core/v4l2-vp9.c b/drivers/media/v4l2-core/v4l2-vp9.c
new file mode 100644
index 000000000000..859589f1fd35
--- /dev/null
+++ b/drivers/media/v4l2-core/v4l2-vp9.c
@@ -0,0 +1,1850 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * V4L2 VP9 helpers.
+ *
+ * Copyright (C) 2021 Collabora, Ltd.
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
+ */
+
+#include <linux/module.h>
+
+#include <media/v4l2-vp9.h>
+
+const u8 v4l2_vp9_kf_y_mode_prob[10][10][9] = {
+	{
+		/* above = dc */
+		{ 137,  30,  42, 148, 151, 207,  70,  52,  91 }, /*left = dc  */
+		{  92,  45, 102, 136, 116, 180,  74,  90, 100 }, /*left = v   */
+		{  73,  32,  19, 187, 222, 215,  46,  34, 100 }, /*left = h   */
+		{  91,  30,  32, 116, 121, 186,  93,  86,  94 }, /*left = d45 */
+		{  72,  35,  36, 149,  68, 206,  68,  63, 105 }, /*left = d135*/
+		{  73,  31,  28, 138,  57, 124,  55, 122, 151 }, /*left = d117*/
+		{  67,  23,  21, 140, 126, 197,  40,  37, 171 }, /*left = d153*/
+		{  86,  27,  28, 128, 154, 212,  45,  43,  53 }, /*left = d207*/
+		{  74,  32,  27, 107,  86, 160,  63, 134, 102 }, /*left = d63 */
+		{  59,  67,  44, 140, 161, 202,  78,  67, 119 }, /*left = tm  */
+	}, {  /* above = v */
+		{  63,  36, 126, 146, 123, 158,  60,  90,  96 }, /*left = dc  */
+		{  43,  46, 168, 134, 107, 128,  69, 142,  92 }, /*left = v   */
+		{  44,  29,  68, 159, 201, 177,  50,  57,  77 }, /*left = h   */
+		{  58,  38,  76, 114,  97, 172,  78, 133,  92 }, /*left = d45 */
+		{  46,  41,  76, 140,  63, 184,  69, 112,  57 }, /*left = d135*/
+		{  38,  32,  85, 140,  46, 112,  54, 151, 133 }, /*left = d117*/
+		{  39,  27,  61, 131, 110, 175,  44,  75, 136 }, /*left = d153*/
+		{  52,  30,  74, 113, 130, 175,  51,  64,  58 }, /*left = d207*/
+		{  47,  35,  80, 100,  74, 143,  64, 163,  74 }, /*left = d63 */
+		{  36,  61, 116, 114, 128, 162,  80, 125,  82 }, /*left = tm  */
+	}, {  /* above = h */
+		{  82,  26,  26, 171, 208, 204,  44,  32, 105 }, /*left = dc  */
+		{  55,  44,  68, 166, 179, 192,  57,  57, 108 }, /*left = v   */
+		{  42,  26,  11, 199, 241, 228,  23,  15,  85 }, /*left = h   */
+		{  68,  42,  19, 131, 160, 199,  55,  52,  83 }, /*left = d45 */
+		{  58,  50,  25, 139, 115, 232,  39,  52, 118 }, /*left = d135*/
+		{  50,  35,  33, 153, 104, 162,  64,  59, 131 }, /*left = d117*/
+		{  44,  24,  16, 150, 177, 202,  33,  19, 156 }, /*left = d153*/
+		{  55,  27,  12, 153, 203, 218,  26,  27,  49 }, /*left = d207*/
+		{  53,  49,  21, 110, 116, 168,  59,  80,  76 }, /*left = d63 */
+		{  38,  72,  19, 168, 203, 212,  50,  50, 107 }, /*left = tm  */
+	}, {  /* above = d45 */
+		{ 103,  26,  36, 129, 132, 201,  83,  80,  93 }, /*left = dc  */
+		{  59,  38,  83, 112, 103, 162,  98, 136,  90 }, /*left = v   */
+		{  62,  30,  23, 158, 200, 207,  59,  57,  50 }, /*left = h   */
+		{  67,  30,  29,  84,  86, 191, 102,  91,  59 }, /*left = d45 */
+		{  60,  32,  33, 112,  71, 220,  64,  89, 104 }, /*left = d135*/
+		{  53,  26,  34, 130,  56, 149,  84, 120, 103 }, /*left = d117*/
+		{  53,  21,  23, 133, 109, 210,  56,  77, 172 }, /*left = d153*/
+		{  77,  19,  29, 112, 142, 228,  55,  66,  36 }, /*left = d207*/
+		{  61,  29,  29,  93,  97, 165,  83, 175, 162 }, /*left = d63 */
+		{  47,  47,  43, 114, 137, 181, 100,  99,  95 }, /*left = tm  */
+	}, {  /* above = d135 */
+		{  69,  23,  29, 128,  83, 199,  46,  44, 101 }, /*left = dc  */
+		{  53,  40,  55, 139,  69, 183,  61,  80, 110 }, /*left = v   */
+		{  40,  29,  19, 161, 180, 207,  43,  24,  91 }, /*left = h   */
+		{  60,  34,  19, 105,  61, 198,  53,  64,  89 }, /*left = d45 */
+		{  52,  31,  22, 158,  40, 209,  58,  62,  89 }, /*left = d135*/
+		{  44,  31,  29, 147,  46, 158,  56, 102, 198 }, /*left = d117*/
+		{  35,  19,  12, 135,  87, 209,  41,  45, 167 }, /*left = d153*/
+		{  55,  25,  21, 118,  95, 215,  38,  39,  66 }, /*left = d207*/
+		{  51,  38,  25, 113,  58, 164,  70,  93,  97 }, /*left = d63 */
+		{  47,  54,  34, 146, 108, 203,  72, 103, 151 }, /*left = tm  */
+	}, {  /* above = d117 */
+		{  64,  19,  37, 156,  66, 138,  49,  95, 133 }, /*left = dc  */
+		{  46,  27,  80, 150,  55, 124,  55, 121, 135 }, /*left = v   */
+		{  36,  23,  27, 165, 149, 166,  54,  64, 118 }, /*left = h   */
+		{  53,  21,  36, 131,  63, 163,  60, 109,  81 }, /*left = d45 */
+		{  40,  26,  35, 154,  40, 185,  51,  97, 123 }, /*left = d135*/
+		{  35,  19,  34, 179,  19,  97,  48, 129, 124 }, /*left = d117*/
+		{  36,  20,  26, 136,  62, 164,  33,  77, 154 }, /*left = d153*/
+		{  45,  18,  32, 130,  90, 157,  40,  79,  91 }, /*left = d207*/
+		{  45,  26,  28, 129,  45, 129,  49, 147, 123 }, /*left = d63 */
+		{  38,  44,  51, 136,  74, 162,  57,  97, 121 }, /*left = tm  */
+	}, {  /* above = d153 */
+		{  75,  17,  22, 136, 138, 185,  32,  34, 166 }, /*left = dc  */
+		{  56,  39,  58, 133, 117, 173,  48,  53, 187 }, /*left = v   */
+		{  35,  21,  12, 161, 212, 207,  20,  23, 145 }, /*left = h   */
+		{  56,  29,  19, 117, 109, 181,  55,  68, 112 }, /*left = d45 */
+		{  47,  29,  17, 153,  64, 220,  59,  51, 114 }, /*left = d135*/
+		{  46,  16,  24, 136,  76, 147,  41,  64, 172 }, /*left = d117*/
+		{  34,  17,  11, 108, 152, 187,  13,  15, 209 }, /*left = d153*/
+		{  51,  24,  14, 115, 133, 209,  32,  26, 104 }, /*left = d207*/
+		{  55,  30,  18, 122,  79, 179,  44,  88, 116 }, /*left = d63 */
+		{  37,  49,  25, 129, 168, 164,  41,  54, 148 }, /*left = tm  */
+	}, {  /* above = d207 */
+		{  82,  22,  32, 127, 143, 213,  39,  41,  70 }, /*left = dc  */
+		{  62,  44,  61, 123, 105, 189,  48,  57,  64 }, /*left = v   */
+		{  47,  25,  17, 175, 222, 220,  24,  30,  86 }, /*left = h   */
+		{  68,  36,  17, 106, 102, 206,  59,  74,  74 }, /*left = d45 */
+		{  57,  39,  23, 151,  68, 216,  55,  63,  58 }, /*left = d135*/
+		{  49,  30,  35, 141,  70, 168,  82,  40, 115 }, /*left = d117*/
+		{  51,  25,  15, 136, 129, 202,  38,  35, 139 }, /*left = d153*/
+		{  68,  26,  16, 111, 141, 215,  29,  28,  28 }, /*left = d207*/
+		{  59,  39,  19, 114,  75, 180,  77, 104,  42 }, /*left = d63 */
+		{  40,  61,  26, 126, 152, 206,  61,  59,  93 }, /*left = tm  */
+	}, {  /* above = d63 */
+		{  78,  23,  39, 111, 117, 170,  74, 124,  94 }, /*left = dc  */
+		{  48,  34,  86, 101,  92, 146,  78, 179, 134 }, /*left = v   */
+		{  47,  22,  24, 138, 187, 178,  68,  69,  59 }, /*left = h   */
+		{  56,  25,  33, 105, 112, 187,  95, 177, 129 }, /*left = d45 */
+		{  48,  31,  27, 114,  63, 183,  82, 116,  56 }, /*left = d135*/
+		{  43,  28,  37, 121,  63, 123,  61, 192, 169 }, /*left = d117*/
+		{  42,  17,  24, 109,  97, 177,  56,  76, 122 }, /*left = d153*/
+		{  58,  18,  28, 105, 139, 182,  70,  92,  63 }, /*left = d207*/
+		{  46,  23,  32,  74,  86, 150,  67, 183,  88 }, /*left = d63 */
+		{  36,  38,  48,  92, 122, 165,  88, 137,  91 }, /*left = tm  */
+	}, {  /* above = tm */
+		{  65,  70,  60, 155, 159, 199,  61,  60,  81 }, /*left = dc  */
+		{  44,  78, 115, 132, 119, 173,  71, 112,  93 }, /*left = v   */
+		{  39,  38,  21, 184, 227, 206,  42,  32,  64 }, /*left = h   */
+		{  58,  47,  36, 124, 137, 193,  80,  82,  78 }, /*left = d45 */
+		{  49,  50,  35, 144,  95, 205,  63,  78,  59 }, /*left = d135*/
+		{  41,  53,  52, 148,  71, 142,  65, 128,  51 }, /*left = d117*/
+		{  40,  36,  28, 143, 143, 202,  40,  55, 137 }, /*left = d153*/
+		{  52,  34,  29, 129, 183, 227,  42,  35,  43 }, /*left = d207*/
+		{  42,  44,  44, 104, 105, 164,  64, 130,  80 }, /*left = d63 */
+		{  43,  81,  53, 140, 169, 204,  68,  84,  72 }, /*left = tm  */
+	}
+};
+EXPORT_SYMBOL_GPL(v4l2_vp9_kf_y_mode_prob);
+
+const u8 v4l2_vp9_kf_partition_probs[16][3] = {
+	/* 8x8 -> 4x4 */
+	{ 158,  97,  94 },	/* a/l both not split   */
+	{  93,  24,  99 },	/* a split, l not split */
+	{  85, 119,  44 },	/* l split, a not split */
+	{  62,  59,  67 },	/* a/l both split       */
+	/* 16x16 -> 8x8 */
+	{ 149,  53,  53 },	/* a/l both not split   */
+	{  94,  20,  48 },	/* a split, l not split */
+	{  83,  53,  24 },	/* l split, a not split */
+	{  52,  18,  18 },	/* a/l both split       */
+	/* 32x32 -> 16x16 */
+	{ 150,  40,  39 },	/* a/l both not split   */
+	{  78,  12,  26 },	/* a split, l not split */
+	{  67,  33,  11 },	/* l split, a not split */
+	{  24,   7,   5 },	/* a/l both split       */
+	/* 64x64 -> 32x32 */
+	{ 174,  35,  49 },	/* a/l both not split   */
+	{  68,  11,  27 },	/* a split, l not split */
+	{  57,  15,   9 },	/* l split, a not split */
+	{  12,   3,   3 },	/* a/l both split       */
+};
+EXPORT_SYMBOL_GPL(v4l2_vp9_kf_partition_probs);
+
+const u8 v4l2_vp9_kf_uv_mode_prob[10][9] = {
+	{ 144,  11,  54, 157, 195, 130,  46,  58, 108 },  /* y = dc   */
+	{ 118,  15, 123, 148, 131, 101,  44,  93, 131 },  /* y = v    */
+	{ 113,  12,  23, 188, 226, 142,  26,  32, 125 },  /* y = h    */
+	{ 120,  11,  50, 123, 163, 135,  64,  77, 103 },  /* y = d45  */
+	{ 113,   9,  36, 155, 111, 157,  32,  44, 161 },  /* y = d135 */
+	{ 116,   9,  55, 176,  76,  96,  37,  61, 149 },  /* y = d117 */
+	{ 115,   9,  28, 141, 161, 167,  21,  25, 193 },  /* y = d153 */
+	{ 120,  12,  32, 145, 195, 142,  32,  38,  86 },  /* y = d207 */
+	{ 116,  12,  64, 120, 140, 125,  49, 115, 121 },  /* y = d63  */
+	{ 102,  19,  66, 162, 182, 122,  35,  59, 128 }   /* y = tm   */
+};
+EXPORT_SYMBOL_GPL(v4l2_vp9_kf_uv_mode_prob);
+
+const struct v4l2_vp9_frame_context v4l2_vp9_default_probs = {
+	.tx8 = {
+		{ 100 },
+		{  66 },
+	},
+	.tx16 = {
+		{ 20, 152 },
+		{ 15, 101 },
+	},
+	.tx32 = {
+		{ 3, 136, 37 },
+		{ 5,  52, 13 },
+	},
+	.coef = {
+		{ /* tx = 4x4 */
+			{ /* block Type 0 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 195,  29, 183 },
+						{  84,  49, 136 },
+						{   8,  42,  71 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  31, 107, 169 },
+						{  35,  99, 159 },
+						{  17,  82, 140 },
+						{   8,  66, 114 },
+						{   2,  44,  76 },
+						{   1,  19,  32 },
+					},
+					{ /* Coeff Band 2 */
+						{  40, 132, 201 },
+						{  29, 114, 187 },
+						{  13,  91, 157 },
+						{   7,  75, 127 },
+						{   3,  58,  95 },
+						{   1,  28,  47 },
+					},
+					{ /* Coeff Band 3 */
+						{  69, 142, 221 },
+						{  42, 122, 201 },
+						{  15,  91, 159 },
+						{   6,  67, 121 },
+						{   1,  42,  77 },
+						{   1,  17,  31 },
+					},
+					{ /* Coeff Band 4 */
+						{ 102, 148, 228 },
+						{  67, 117, 204 },
+						{  17,  82, 154 },
+						{   6,  59, 114 },
+						{   2,  39,  75 },
+						{   1,  15,  29 },
+					},
+					{ /* Coeff Band 5 */
+						{ 156,  57, 233 },
+						{ 119,  57, 212 },
+						{  58,  48, 163 },
+						{  29,  40, 124 },
+						{  12,  30,  81 },
+						{   3,  12,  31 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 191, 107, 226 },
+						{ 124, 117, 204 },
+						{  25,  99, 155 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  29, 148, 210 },
+						{  37, 126, 194 },
+						{   8,  93, 157 },
+						{   2,  68, 118 },
+						{   1,  39,  69 },
+						{   1,  17,  33 },
+					},
+					{ /* Coeff Band 2 */
+						{  41, 151, 213 },
+						{  27, 123, 193 },
+						{   3,  82, 144 },
+						{   1,  58, 105 },
+						{   1,  32,  60 },
+						{   1,  13,  26 },
+					},
+					{ /* Coeff Band 3 */
+						{  59, 159, 220 },
+						{  23, 126, 198 },
+						{   4,  88, 151 },
+						{   1,  66, 114 },
+						{   1,  38,  71 },
+						{   1,  18,  34 },
+					},
+					{ /* Coeff Band 4 */
+						{ 114, 136, 232 },
+						{  51, 114, 207 },
+						{  11,  83, 155 },
+						{   3,  56, 105 },
+						{   1,  33,  65 },
+						{   1,  17,  34 },
+					},
+					{ /* Coeff Band 5 */
+						{ 149,  65, 234 },
+						{ 121,  57, 215 },
+						{  61,  49, 166 },
+						{  28,  36, 114 },
+						{  12,  25,  76 },
+						{   3,  16,  42 },
+					},
+				},
+			},
+			{ /* block Type 1 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 214,  49, 220 },
+						{ 132,  63, 188 },
+						{  42,  65, 137 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  85, 137, 221 },
+						{ 104, 131, 216 },
+						{  49, 111, 192 },
+						{  21,  87, 155 },
+						{   2,  49,  87 },
+						{   1,  16,  28 },
+					},
+					{ /* Coeff Band 2 */
+						{  89, 163, 230 },
+						{  90, 137, 220 },
+						{  29, 100, 183 },
+						{  10,  70, 135 },
+						{   2,  42,  81 },
+						{   1,  17,  33 },
+					},
+					{ /* Coeff Band 3 */
+						{ 108, 167, 237 },
+						{  55, 133, 222 },
+						{  15,  97, 179 },
+						{   4,  72, 135 },
+						{   1,  45,  85 },
+						{   1,  19,  38 },
+					},
+					{ /* Coeff Band 4 */
+						{ 124, 146, 240 },
+						{  66, 124, 224 },
+						{  17,  88, 175 },
+						{   4,  58, 122 },
+						{   1,  36,  75 },
+						{   1,  18,  37 },
+					},
+					{ /* Coeff Band 5 */
+						{ 141,  79, 241 },
+						{ 126,  70, 227 },
+						{  66,  58, 182 },
+						{  30,  44, 136 },
+						{  12,  34,  96 },
+						{   2,  20,  47 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 229,  99, 249 },
+						{ 143, 111, 235 },
+						{  46, 109, 192 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  82, 158, 236 },
+						{  94, 146, 224 },
+						{  25, 117, 191 },
+						{   9,  87, 149 },
+						{   3,  56,  99 },
+						{   1,  33,  57 },
+					},
+					{ /* Coeff Band 2 */
+						{  83, 167, 237 },
+						{  68, 145, 222 },
+						{  10, 103, 177 },
+						{   2,  72, 131 },
+						{   1,  41,  79 },
+						{   1,  20,  39 },
+					},
+					{ /* Coeff Band 3 */
+						{  99, 167, 239 },
+						{  47, 141, 224 },
+						{  10, 104, 178 },
+						{   2,  73, 133 },
+						{   1,  44,  85 },
+						{   1,  22,  47 },
+					},
+					{ /* Coeff Band 4 */
+						{ 127, 145, 243 },
+						{  71, 129, 228 },
+						{  17,  93, 177 },
+						{   3,  61, 124 },
+						{   1,  41,  84 },
+						{   1,  21,  52 },
+					},
+					{ /* Coeff Band 5 */
+						{ 157,  78, 244 },
+						{ 140,  72, 231 },
+						{  69,  58, 184 },
+						{  31,  44, 137 },
+						{  14,  38, 105 },
+						{   8,  23,  61 },
+					},
+				},
+			},
+		},
+		{ /* tx = 8x8 */
+			{ /* block Type 0 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 125,  34, 187 },
+						{  52,  41, 133 },
+						{   6,  31,  56 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  37, 109, 153 },
+						{  51, 102, 147 },
+						{  23,  87, 128 },
+						{   8,  67, 101 },
+						{   1,  41,  63 },
+						{   1,  19,  29 },
+					},
+					{ /* Coeff Band 2 */
+						{  31, 154, 185 },
+						{  17, 127, 175 },
+						{   6,  96, 145 },
+						{   2,  73, 114 },
+						{   1,  51,  82 },
+						{   1,  28,  45 },
+					},
+					{ /* Coeff Band 3 */
+						{  23, 163, 200 },
+						{  10, 131, 185 },
+						{   2,  93, 148 },
+						{   1,  67, 111 },
+						{   1,  41,  69 },
+						{   1,  14,  24 },
+					},
+					{ /* Coeff Band 4 */
+						{  29, 176, 217 },
+						{  12, 145, 201 },
+						{   3, 101, 156 },
+						{   1,  69, 111 },
+						{   1,  39,  63 },
+						{   1,  14,  23 },
+					},
+					{ /* Coeff Band 5 */
+						{  57, 192, 233 },
+						{  25, 154, 215 },
+						{   6, 109, 167 },
+						{   3,  78, 118 },
+						{   1,  48,  69 },
+						{   1,  21,  29 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 202, 105, 245 },
+						{ 108, 106, 216 },
+						{  18,  90, 144 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  33, 172, 219 },
+						{  64, 149, 206 },
+						{  14, 117, 177 },
+						{   5,  90, 141 },
+						{   2,  61,  95 },
+						{   1,  37,  57 },
+					},
+					{ /* Coeff Band 2 */
+						{  33, 179, 220 },
+						{  11, 140, 198 },
+						{   1,  89, 148 },
+						{   1,  60, 104 },
+						{   1,  33,  57 },
+						{   1,  12,  21 },
+					},
+					{ /* Coeff Band 3 */
+						{  30, 181, 221 },
+						{   8, 141, 198 },
+						{   1,  87, 145 },
+						{   1,  58, 100 },
+						{   1,  31,  55 },
+						{   1,  12,  20 },
+					},
+					{ /* Coeff Band 4 */
+						{  32, 186, 224 },
+						{   7, 142, 198 },
+						{   1,  86, 143 },
+						{   1,  58, 100 },
+						{   1,  31,  55 },
+						{   1,  12,  22 },
+					},
+					{ /* Coeff Band 5 */
+						{  57, 192, 227 },
+						{  20, 143, 204 },
+						{   3,  96, 154 },
+						{   1,  68, 112 },
+						{   1,  42,  69 },
+						{   1,  19,  32 },
+					},
+				},
+			},
+			{ /* block Type 1 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 212,  35, 215 },
+						{ 113,  47, 169 },
+						{  29,  48, 105 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  74, 129, 203 },
+						{ 106, 120, 203 },
+						{  49, 107, 178 },
+						{  19,  84, 144 },
+						{   4,  50,  84 },
+						{   1,  15,  25 },
+					},
+					{ /* Coeff Band 2 */
+						{  71, 172, 217 },
+						{  44, 141, 209 },
+						{  15, 102, 173 },
+						{   6,  76, 133 },
+						{   2,  51,  89 },
+						{   1,  24,  42 },
+					},
+					{ /* Coeff Band 3 */
+						{  64, 185, 231 },
+						{  31, 148, 216 },
+						{   8, 103, 175 },
+						{   3,  74, 131 },
+						{   1,  46,  81 },
+						{   1,  18,  30 },
+					},
+					{ /* Coeff Band 4 */
+						{  65, 196, 235 },
+						{  25, 157, 221 },
+						{   5, 105, 174 },
+						{   1,  67, 120 },
+						{   1,  38,  69 },
+						{   1,  15,  30 },
+					},
+					{ /* Coeff Band 5 */
+						{  65, 204, 238 },
+						{  30, 156, 224 },
+						{   7, 107, 177 },
+						{   2,  70, 124 },
+						{   1,  42,  73 },
+						{   1,  18,  34 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 225,  86, 251 },
+						{ 144, 104, 235 },
+						{  42,  99, 181 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  85, 175, 239 },
+						{ 112, 165, 229 },
+						{  29, 136, 200 },
+						{  12, 103, 162 },
+						{   6,  77, 123 },
+						{   2,  53,  84 },
+					},
+					{ /* Coeff Band 2 */
+						{  75, 183, 239 },
+						{  30, 155, 221 },
+						{   3, 106, 171 },
+						{   1,  74, 128 },
+						{   1,  44,  76 },
+						{   1,  17,  28 },
+					},
+					{ /* Coeff Band 3 */
+						{  73, 185, 240 },
+						{  27, 159, 222 },
+						{   2, 107, 172 },
+						{   1,  75, 127 },
+						{   1,  42,  73 },
+						{   1,  17,  29 },
+					},
+					{ /* Coeff Band 4 */
+						{  62, 190, 238 },
+						{  21, 159, 222 },
+						{   2, 107, 172 },
+						{   1,  72, 122 },
+						{   1,  40,  71 },
+						{   1,  18,  32 },
+					},
+					{ /* Coeff Band 5 */
+						{  61, 199, 240 },
+						{  27, 161, 226 },
+						{   4, 113, 180 },
+						{   1,  76, 129 },
+						{   1,  46,  80 },
+						{   1,  23,  41 },
+					},
+				},
+			},
+		},
+		{ /* tx = 16x16 */
+			{ /* block Type 0 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{   7,  27, 153 },
+						{   5,  30,  95 },
+						{   1,  16,  30 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  50,  75, 127 },
+						{  57,  75, 124 },
+						{  27,  67, 108 },
+						{  10,  54,  86 },
+						{   1,  33,  52 },
+						{   1,  12,  18 },
+					},
+					{ /* Coeff Band 2 */
+						{  43, 125, 151 },
+						{  26, 108, 148 },
+						{   7,  83, 122 },
+						{   2,  59,  89 },
+						{   1,  38,  60 },
+						{   1,  17,  27 },
+					},
+					{ /* Coeff Band 3 */
+						{  23, 144, 163 },
+						{  13, 112, 154 },
+						{   2,  75, 117 },
+						{   1,  50,  81 },
+						{   1,  31,  51 },
+						{   1,  14,  23 },
+					},
+					{ /* Coeff Band 4 */
+						{  18, 162, 185 },
+						{   6, 123, 171 },
+						{   1,  78, 125 },
+						{   1,  51,  86 },
+						{   1,  31,  54 },
+						{   1,  14,  23 },
+					},
+					{ /* Coeff Band 5 */
+						{  15, 199, 227 },
+						{   3, 150, 204 },
+						{   1,  91, 146 },
+						{   1,  55,  95 },
+						{   1,  30,  53 },
+						{   1,  11,  20 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{  19,  55, 240 },
+						{  19,  59, 196 },
+						{   3,  52, 105 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  41, 166, 207 },
+						{ 104, 153, 199 },
+						{  31, 123, 181 },
+						{  14, 101, 152 },
+						{   5,  72, 106 },
+						{   1,  36,  52 },
+					},
+					{ /* Coeff Band 2 */
+						{  35, 176, 211 },
+						{  12, 131, 190 },
+						{   2,  88, 144 },
+						{   1,  60, 101 },
+						{   1,  36,  60 },
+						{   1,  16,  28 },
+					},
+					{ /* Coeff Band 3 */
+						{  28, 183, 213 },
+						{   8, 134, 191 },
+						{   1,  86, 142 },
+						{   1,  56,  96 },
+						{   1,  30,  53 },
+						{   1,  12,  20 },
+					},
+					{ /* Coeff Band 4 */
+						{  20, 190, 215 },
+						{   4, 135, 192 },
+						{   1,  84, 139 },
+						{   1,  53,  91 },
+						{   1,  28,  49 },
+						{   1,  11,  20 },
+					},
+					{ /* Coeff Band 5 */
+						{  13, 196, 216 },
+						{   2, 137, 192 },
+						{   1,  86, 143 },
+						{   1,  57,  99 },
+						{   1,  32,  56 },
+						{   1,  13,  24 },
+					},
+				},
+			},
+			{ /* block Type 1 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 211,  29, 217 },
+						{  96,  47, 156 },
+						{  22,  43,  87 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  78, 120, 193 },
+						{ 111, 116, 186 },
+						{  46, 102, 164 },
+						{  15,  80, 128 },
+						{   2,  49,  76 },
+						{   1,  18,  28 },
+					},
+					{ /* Coeff Band 2 */
+						{  71, 161, 203 },
+						{  42, 132, 192 },
+						{  10,  98, 150 },
+						{   3,  69, 109 },
+						{   1,  44,  70 },
+						{   1,  18,  29 },
+					},
+					{ /* Coeff Band 3 */
+						{  57, 186, 211 },
+						{  30, 140, 196 },
+						{   4,  93, 146 },
+						{   1,  62, 102 },
+						{   1,  38,  65 },
+						{   1,  16,  27 },
+					},
+					{ /* Coeff Band 4 */
+						{  47, 199, 217 },
+						{  14, 145, 196 },
+						{   1,  88, 142 },
+						{   1,  57,  98 },
+						{   1,  36,  62 },
+						{   1,  15,  26 },
+					},
+					{ /* Coeff Band 5 */
+						{  26, 219, 229 },
+						{   5, 155, 207 },
+						{   1,  94, 151 },
+						{   1,  60, 104 },
+						{   1,  36,  62 },
+						{   1,  16,  28 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 233,  29, 248 },
+						{ 146,  47, 220 },
+						{  43,  52, 140 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{ 100, 163, 232 },
+						{ 179, 161, 222 },
+						{  63, 142, 204 },
+						{  37, 113, 174 },
+						{  26,  89, 137 },
+						{  18,  68,  97 },
+					},
+					{ /* Coeff Band 2 */
+						{  85, 181, 230 },
+						{  32, 146, 209 },
+						{   7, 100, 164 },
+						{   3,  71, 121 },
+						{   1,  45,  77 },
+						{   1,  18,  30 },
+					},
+					{ /* Coeff Band 3 */
+						{  65, 187, 230 },
+						{  20, 148, 207 },
+						{   2,  97, 159 },
+						{   1,  68, 116 },
+						{   1,  40,  70 },
+						{   1,  14,  29 },
+					},
+					{ /* Coeff Band 4 */
+						{  40, 194, 227 },
+						{   8, 147, 204 },
+						{   1,  94, 155 },
+						{   1,  65, 112 },
+						{   1,  39,  66 },
+						{   1,  14,  26 },
+					},
+					{ /* Coeff Band 5 */
+						{  16, 208, 228 },
+						{   3, 151, 207 },
+						{   1,  98, 160 },
+						{   1,  67, 117 },
+						{   1,  41,  74 },
+						{   1,  17,  31 },
+					},
+				},
+			},
+		},
+		{ /* tx = 32x32 */
+			{ /* block Type 0 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{  17,  38, 140 },
+						{   7,  34,  80 },
+						{   1,  17,  29 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  37,  75, 128 },
+						{  41,  76, 128 },
+						{  26,  66, 116 },
+						{  12,  52,  94 },
+						{   2,  32,  55 },
+						{   1,  10,  16 },
+					},
+					{ /* Coeff Band 2 */
+						{  50, 127, 154 },
+						{  37, 109, 152 },
+						{  16,  82, 121 },
+						{   5,  59,  85 },
+						{   1,  35,  54 },
+						{   1,  13,  20 },
+					},
+					{ /* Coeff Band 3 */
+						{  40, 142, 167 },
+						{  17, 110, 157 },
+						{   2,  71, 112 },
+						{   1,  44,  72 },
+						{   1,  27,  45 },
+						{   1,  11,  17 },
+					},
+					{ /* Coeff Band 4 */
+						{  30, 175, 188 },
+						{   9, 124, 169 },
+						{   1,  74, 116 },
+						{   1,  48,  78 },
+						{   1,  30,  49 },
+						{   1,  11,  18 },
+					},
+					{ /* Coeff Band 5 */
+						{  10, 222, 223 },
+						{   2, 150, 194 },
+						{   1,  83, 128 },
+						{   1,  48,  79 },
+						{   1,  27,  45 },
+						{   1,  11,  17 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{  36,  41, 235 },
+						{  29,  36, 193 },
+						{  10,  27, 111 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  85, 165, 222 },
+						{ 177, 162, 215 },
+						{ 110, 135, 195 },
+						{  57, 113, 168 },
+						{  23,  83, 120 },
+						{  10,  49,  61 },
+					},
+					{ /* Coeff Band 2 */
+						{  85, 190, 223 },
+						{  36, 139, 200 },
+						{   5,  90, 146 },
+						{   1,  60, 103 },
+						{   1,  38,  65 },
+						{   1,  18,  30 },
+					},
+					{ /* Coeff Band 3 */
+						{  72, 202, 223 },
+						{  23, 141, 199 },
+						{   2,  86, 140 },
+						{   1,  56,  97 },
+						{   1,  36,  61 },
+						{   1,  16,  27 },
+					},
+					{ /* Coeff Band 4 */
+						{  55, 218, 225 },
+						{  13, 145, 200 },
+						{   1,  86, 141 },
+						{   1,  57,  99 },
+						{   1,  35,  61 },
+						{   1,  13,  22 },
+					},
+					{ /* Coeff Band 5 */
+						{  15, 235, 212 },
+						{   1, 132, 184 },
+						{   1,  84, 139 },
+						{   1,  57,  97 },
+						{   1,  34,  56 },
+						{   1,  14,  23 },
+					},
+				},
+			},
+			{ /* block Type 1 */
+				{ /* Intra */
+					{ /* Coeff Band 0 */
+						{ 181,  21, 201 },
+						{  61,  37, 123 },
+						{  10,  38,  71 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{  47, 106, 172 },
+						{  95, 104, 173 },
+						{  42,  93, 159 },
+						{  18,  77, 131 },
+						{   4,  50,  81 },
+						{   1,  17,  23 },
+					},
+					{ /* Coeff Band 2 */
+						{  62, 147, 199 },
+						{  44, 130, 189 },
+						{  28, 102, 154 },
+						{  18,  75, 115 },
+						{   2,  44,  65 },
+						{   1,  12,  19 },
+					},
+					{ /* Coeff Band 3 */
+						{  55, 153, 210 },
+						{  24, 130, 194 },
+						{   3,  93, 146 },
+						{   1,  61,  97 },
+						{   1,  31,  50 },
+						{   1,  10,  16 },
+					},
+					{ /* Coeff Band 4 */
+						{  49, 186, 223 },
+						{  17, 148, 204 },
+						{   1,  96, 142 },
+						{   1,  53,  83 },
+						{   1,  26,  44 },
+						{   1,  11,  17 },
+					},
+					{ /* Coeff Band 5 */
+						{  13, 217, 212 },
+						{   2, 136, 180 },
+						{   1,  78, 124 },
+						{   1,  50,  83 },
+						{   1,  29,  49 },
+						{   1,  14,  23 },
+					},
+				},
+				{ /* Inter */
+					{ /* Coeff Band 0 */
+						{ 197,  13, 247 },
+						{  82,  17, 222 },
+						{  25,  17, 162 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+						{   0,   0,   0 },
+					},
+					{ /* Coeff Band 1 */
+						{ 126, 186, 247 },
+						{ 234, 191, 243 },
+						{ 176, 177, 234 },
+						{ 104, 158, 220 },
+						{  66, 128, 186 },
+						{  55,  90, 137 },
+					},
+					{ /* Coeff Band 2 */
+						{ 111, 197, 242 },
+						{  46, 158, 219 },
+						{   9, 104, 171 },
+						{   2,  65, 125 },
+						{   1,  44,  80 },
+						{   1,  17,  91 },
+					},
+					{ /* Coeff Band 3 */
+						{ 104, 208, 245 },
+						{  39, 168, 224 },
+						{   3, 109, 162 },
+						{   1,  79, 124 },
+						{   1,  50, 102 },
+						{   1,  43, 102 },
+					},
+					{ /* Coeff Band 4 */
+						{  84, 220, 246 },
+						{  31, 177, 231 },
+						{   2, 115, 180 },
+						{   1,  79, 134 },
+						{   1,  55,  77 },
+						{   1,  60,  79 },
+					},
+					{ /* Coeff Band 5 */
+						{  43, 243, 240 },
+						{   8, 180, 217 },
+						{   1, 115, 166 },
+						{   1,  84, 121 },
+						{   1,  51,  67 },
+						{   1,  16,   6 },
+					},
+				},
+			},
+		},
+	},
+
+	.skip = { 192, 128, 64 },
+	.inter_mode = {
+		{  2, 173, 34 },
+		{  7, 145, 85 },
+		{  7, 166, 63 },
+		{  7,  94, 66 },
+		{  8,  64, 46 },
+		{ 17,  81, 31 },
+		{ 25,  29, 30 },
+	},
+	.interp_filter = {
+		{ 235, 162 },
+		{  36, 255 },
+		{  34,   3 },
+		{ 149, 144 },
+	},
+	.is_inter = { 9, 102, 187, 225 },
+	.comp_mode = { 239, 183, 119, 96, 41 },
+	.single_ref = {
+		{  33,  16 },
+		{  77,  74 },
+		{ 142, 142 },
+		{ 172, 170 },
+		{ 238, 247 },
+	},
+	.comp_ref = { 50, 126, 123, 221, 226 },
+	.y_mode = {
+		{  65,  32, 18, 144, 162, 194, 41, 51, 98 },
+		{ 132,  68, 18, 165, 217, 196, 45, 40, 78 },
+		{ 173,  80, 19, 176, 240, 193, 64, 35, 46 },
+		{ 221, 135, 38, 194, 248, 121, 96, 85, 29 },
+	},
+	.uv_mode = {
+		{ 120,   7,  76, 176, 208, 126,  28,  54, 103 } /* y = dc */,
+		{  48,  12, 154, 155, 139,  90,  34, 117, 119 } /* y = v */,
+		{  67,   6,  25, 204, 243, 158,  13,  21,  96 } /* y = h */,
+		{  97,   5,  44, 131, 176, 139,  48,  68,  97 } /* y = d45 */,
+		{  83,   5,  42, 156, 111, 152,  26,  49, 152 } /* y = d135 */,
+		{  80,   5,  58, 178,  74,  83,  33,  62, 145 } /* y = d117 */,
+		{  86,   5,  32, 154, 192, 168,  14,  22, 163 } /* y = d153 */,
+		{  85,   5,  32, 156, 216, 148,  19,  29,  73 } /* y = d207 */,
+		{  77,   7,  64, 116, 132, 122,  37, 126, 120 } /* y = d63 */,
+		{ 101,  21, 107, 181, 192, 103,  19,  67, 125 } /* y = tm */
+	},
+	.partition = {
+		/* 8x8 -> 4x4 */
+		{ 199, 122, 141 } /* a/l both not split */,
+		{ 147,  63, 159 } /* a split, l not split */,
+		{ 148, 133, 118 } /* l split, a not split */,
+		{ 121, 104, 114 } /* a/l both split */,
+		/* 16x16 -> 8x8 */
+		{ 174,  73,  87 } /* a/l both not split */,
+		{  92,  41,  83 } /* a split, l not split */,
+		{  82,  99,  50 } /* l split, a not split */,
+		{  53,  39,  39 } /* a/l both split */,
+		/* 32x32 -> 16x16 */
+		{ 177,  58,  59 } /* a/l both not split */,
+		{  68,  26,  63 } /* a split, l not split */,
+		{  52,  79,  25 } /* l split, a not split */,
+		{  17,  14,  12 } /* a/l both split */,
+		/* 64x64 -> 32x32 */
+		{ 222,  34,  30 } /* a/l both not split */,
+		{  72,  16,  44 } /* a split, l not split */,
+		{  58,  32,  12 } /* l split, a not split */,
+		{  10,   7,   6 } /* a/l both split */,
+	},
+
+	.mv = {
+		.joint = { 32, 64, 96 },
+		.sign = { 128, 128 },
+		.classes = {
+			{ 224, 144, 192, 168, 192, 176, 192, 198, 198, 245 },
+			{ 216, 128, 176, 160, 176, 176, 192, 198, 198, 208 },
+		},
+		.class0_bit = { 216, 208 },
+		.bits = {
+			{ 136, 140, 148, 160, 176, 192, 224, 234, 234, 240 },
+			{ 136, 140, 148, 160, 176, 192, 224, 234, 234, 240 },
+		},
+		.class0_fr = {
+			{
+				{ 128, 128, 64 },
+				{  96, 112, 64 },
+			},
+			{
+				{ 128, 128, 64 },
+				{  96, 112, 64 },
+			},
+		},
+		.fr = {
+			{ 64, 96, 64 },
+			{ 64, 96, 64 },
+		},
+		.class0_hp = { 160, 160 },
+		.hp = { 128, 128 },
+	},
+};
+EXPORT_SYMBOL_GPL(v4l2_vp9_default_probs);
+
+static u32 fastdiv(u32 dividend, u16 divisor)
+{
+#define DIV_INV(d)	((u32)(((1ULL << 32) + ((d) - 1)) / (d)))
+#define DIVS_INV(d0, d1, d2, d3, d4, d5, d6, d7, d8, d9)	\
+	DIV_INV(d0), DIV_INV(d1), DIV_INV(d2), DIV_INV(d3),	\
+	DIV_INV(d4), DIV_INV(d5), DIV_INV(d6), DIV_INV(d7),	\
+	DIV_INV(d8), DIV_INV(d9)
+
+	static const u32 inv[] = {
+		DIV_INV(2), DIV_INV(3), DIV_INV(4), DIV_INV(5),
+		DIV_INV(6), DIV_INV(7), DIV_INV(8), DIV_INV(9),
+		DIVS_INV(10, 11, 12, 13, 14, 15, 16, 17, 18, 19),
+		DIVS_INV(20, 21, 22, 23, 24, 25, 26, 27, 28, 29),
+		DIVS_INV(30, 31, 32, 33, 34, 35, 36, 37, 38, 39),
+		DIVS_INV(40, 41, 42, 43, 44, 45, 46, 47, 48, 49),
+		DIVS_INV(50, 51, 52, 53, 54, 55, 56, 57, 58, 59),
+		DIVS_INV(60, 61, 62, 63, 64, 65, 66, 67, 68, 69),
+		DIVS_INV(70, 71, 72, 73, 74, 75, 76, 77, 78, 79),
+		DIVS_INV(80, 81, 82, 83, 84, 85, 86, 87, 88, 89),
+		DIVS_INV(90, 91, 92, 93, 94, 95, 96, 97, 98, 99),
+		DIVS_INV(100, 101, 102, 103, 104, 105, 106, 107, 108, 109),
+		DIVS_INV(110, 111, 112, 113, 114, 115, 116, 117, 118, 119),
+		DIVS_INV(120, 121, 122, 123, 124, 125, 126, 127, 128, 129),
+		DIVS_INV(130, 131, 132, 133, 134, 135, 136, 137, 138, 139),
+		DIVS_INV(140, 141, 142, 143, 144, 145, 146, 147, 148, 149),
+		DIVS_INV(150, 151, 152, 153, 154, 155, 156, 157, 158, 159),
+		DIVS_INV(160, 161, 162, 163, 164, 165, 166, 167, 168, 169),
+		DIVS_INV(170, 171, 172, 173, 174, 175, 176, 177, 178, 179),
+		DIVS_INV(180, 181, 182, 183, 184, 185, 186, 187, 188, 189),
+		DIVS_INV(190, 191, 192, 193, 194, 195, 196, 197, 198, 199),
+		DIVS_INV(200, 201, 202, 203, 204, 205, 206, 207, 208, 209),
+		DIVS_INV(210, 211, 212, 213, 214, 215, 216, 217, 218, 219),
+		DIVS_INV(220, 221, 222, 223, 224, 225, 226, 227, 228, 229),
+		DIVS_INV(230, 231, 232, 233, 234, 235, 236, 237, 238, 239),
+		DIVS_INV(240, 241, 242, 243, 244, 245, 246, 247, 248, 249),
+		DIV_INV(250), DIV_INV(251), DIV_INV(252), DIV_INV(253),
+		DIV_INV(254), DIV_INV(255), DIV_INV(256),
+	};
+
+	if (divisor == 0)
+		return 0;
+	else if (divisor == 1)
+		return dividend;
+
+	if (WARN_ON(divisor - 2 >= ARRAY_SIZE(inv)))
+		return dividend;
+
+	return ((u64)dividend * inv[divisor - 2]) >> 32;
+}
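
For readers checking fastdiv(): it replaces a division by a small divisor with
a multiplication by ceil(2^32 / d) followed by a 32-bit right shift. Below is a
minimal user-space sketch, not part of the patch, assuming stdint types in
place of u32/u16 and computing the reciprocal on the fly instead of reading it
from the inv[] table. The names recip32(), fastdiv_sketch() and
fastdiv_mismatches() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(2^32 / d), the same formula as DIV_INV() in the patch; valid for d >= 2. */
static uint32_t recip32(uint32_t d)
{
	return (uint32_t)(((1ULL << 32) + (d - 1)) / d);
}

/* Mirrors the control flow of fastdiv(), minus the lookup table. */
static uint32_t fastdiv_sketch(uint32_t dividend, uint16_t divisor)
{
	assert(divisor <= 256);	/* the patch uses WARN_ON() for this range check */

	if (divisor == 0)
		return 0;
	if (divisor == 1)
		return dividend;

	return (uint32_t)(((uint64_t)dividend * recip32(divisor)) >> 32);
}

/* Count disagreements with plain '/' for dividends below limit, divisors 2..256. */
static unsigned int fastdiv_mismatches(uint32_t limit)
{
	unsigned int bad = 0;
	uint32_t n, d;

	for (n = 0; n < limit; n++)
		for (d = 2; d <= 256; d++)
			if (fastdiv_sketch(n, (uint16_t)d) != n / d)
				bad++;

	return bad;
}
```

The rounding error of the reciprocal is below d / 2^32 per unit of dividend, so
the result is exact whenever dividend * d stays below 2^32. The callers in this
file pass max_update_factor * count, which is small enough for that to hold
with divisors up to 256.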
+
+/* 6.3.6 inv_recenter_nonneg(v, m) */
+static int inv_recenter_nonneg(int v, int m)
+{
+	if (v > 2 * m)
+		return v;
+
+	if (v & 1)
+		return m - ((v + 1) >> 1);
+
+	return m + (v >> 1);
+}
+
+/*
+ * part of 6.3.5 inv_remap_prob(deltaProb, prob)
+ * delta = inv_map_table[deltaProb] done by userspace
+ */
+static int update_prob(int delta, int prob)
+{
+	if (!delta)
+		return prob;
+
+	return prob <= 128 ?
+		1 + inv_recenter_nonneg(delta, prob - 1) :
+		255 - inv_recenter_nonneg(delta, 255 - prob);
+}
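
To illustrate how update_prob() moves a probability, here is a small
user-space sketch, not part of the patch. It copies the two helpers above
under hypothetical *_sketch names and assumes, as the comment states, that
delta has already been mapped through inv_map_table by userspace.

```c
#include <assert.h>

/* Same logic as inv_recenter_nonneg() in the patch. */
static int inv_recenter_nonneg_sketch(int v, int m)
{
	if (v > 2 * m)
		return v;

	if (v & 1)
		return m - ((v + 1) >> 1);

	return m + (v >> 1);
}

/* Same logic as update_prob(); a zero delta means "no update". */
static int update_prob_sketch(int delta, int prob)
{
	assert(prob >= 1 && prob <= 255);

	if (!delta)
		return prob;

	return prob <= 128 ?
		1 + inv_recenter_nonneg_sketch(delta, prob - 1) :
		255 - inv_recenter_nonneg_sketch(delta, 255 - prob);
}
```

With prob = 128, a delta of 1 yields 127 and a delta of 2 yields 129: small
odd deltas step below the old value and small even deltas step above it. For
prob > 128 the direction is mirrored, so from prob = 200 a delta of 3 yields
202 and a delta of 4 yields 198. A delta larger than twice the recentering
point is used directly, e.g. prob = 10 and delta = 30 yields 31.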
+
+/* Counterpart to 6.3.2 tx_mode_probs() */
+static void update_tx_probs(struct v4l2_vp9_frame_context *probs,
+			    const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(probs->tx8); i++) {
+		u8 *p8x8 = probs->tx8[i];
+		u8 *p16x16 = probs->tx16[i];
+		u8 *p32x32 = probs->tx32[i];
+		const u8 *d8x8 = deltas->tx8[i];
+		const u8 *d16x16 = deltas->tx16[i];
+		const u8 *d32x32 = deltas->tx32[i];
+
+		p8x8[0] = update_prob(d8x8[0], p8x8[0]);
+		p16x16[0] = update_prob(d16x16[0], p16x16[0]);
+		p16x16[1] = update_prob(d16x16[1], p16x16[1]);
+		p32x32[0] = update_prob(d32x32[0], p32x32[0]);
+		p32x32[1] = update_prob(d32x32[1], p32x32[1]);
+		p32x32[2] = update_prob(d32x32[2], p32x32[2]);
+	}
+}
+
+#define BAND_6(band) ((band) == 0 ? 3 : 6)
+
+static void update_coeff(const u8 deltas[6][6][3], u8 probs[6][6][3])
+{
+	int l, m, n;
+
+	for (l = 0; l < 6; l++)
+		for (m = 0; m < BAND_6(l); m++) {
+			u8 *p = probs[l][m];
+			const u8 *d = deltas[l][m];
+
+			for (n = 0; n < 3; n++)
+				p[n] = update_prob(d[n], p[n]);
+		}
+}
+
+/* Counterpart to 6.3.7 read_coef_probs() */
+static void update_coef_probs(struct v4l2_vp9_frame_context *probs,
+			      const struct v4l2_ctrl_vp9_compressed_hdr *deltas,
+			      const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	int i, j, k;
+
+	for (i = 0; i < ARRAY_SIZE(probs->coef); i++) {
+		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++)
+			for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++)
+				update_coeff(deltas->coef[i][j][k], probs->coef[i][j][k]);
+
+		if (deltas->tx_mode == i)
+			break;
+	}
+}
+
+/* Counterpart to 6.3.8 read_skip_prob() */
+static void update_skip_probs(struct v4l2_vp9_frame_context *probs,
+			      const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(probs->skip); i++)
+		probs->skip[i] = update_prob(deltas->skip[i], probs->skip[i]);
+}
+
+/* Counterpart to 6.3.9 read_inter_mode_probs() */
+static void update_inter_mode_probs(struct v4l2_vp9_frame_context *probs,
+				    const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(probs->inter_mode); i++) {
+		u8 *p = probs->inter_mode[i];
+		const u8 *d = deltas->inter_mode[i];
+
+		p[0] = update_prob(d[0], p[0]);
+		p[1] = update_prob(d[1], p[1]);
+		p[2] = update_prob(d[2], p[2]);
+	}
+}
+
+/* Counterpart to 6.3.10 read_interp_filter_probs() */
+static void update_interp_filter_probs(struct v4l2_vp9_frame_context *probs,
+				       const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(probs->interp_filter); i++) {
+		u8 *p = probs->interp_filter[i];
+		const u8 *d = deltas->interp_filter[i];
+
+		p[0] = update_prob(d[0], p[0]);
+		p[1] = update_prob(d[1], p[1]);
+	}
+}
+
+/* Counterpart to 6.3.11 read_is_inter_probs() */
+static void update_is_inter_probs(struct v4l2_vp9_frame_context *probs,
+				  const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(probs->is_inter); i++)
+		probs->is_inter[i] = update_prob(deltas->is_inter[i], probs->is_inter[i]);
+}
+
+/* 6.3.12 frame_reference_mode() done entirely in userspace */
+
+/* Counterpart to 6.3.13 frame_reference_mode_probs() */
+static void
+update_frame_reference_mode_probs(unsigned int reference_mode,
+				  struct v4l2_vp9_frame_context *probs,
+				  const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i;
+
+	if (reference_mode == V4L2_VP9_REFERENCE_MODE_SELECT)
+		for (i = 0; i < ARRAY_SIZE(probs->comp_mode); i++)
+			probs->comp_mode[i] = update_prob(deltas->comp_mode[i],
+							  probs->comp_mode[i]);
+
+	if (reference_mode != V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE)
+		for (i = 0; i < ARRAY_SIZE(probs->single_ref); i++) {
+			u8 *p = probs->single_ref[i];
+			const u8 *d = deltas->single_ref[i];
+
+			p[0] = update_prob(d[0], p[0]);
+			p[1] = update_prob(d[1], p[1]);
+		}
+
+	if (reference_mode != V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE)
+		for (i = 0; i < ARRAY_SIZE(probs->comp_ref); i++)
+			probs->comp_ref[i] = update_prob(deltas->comp_ref[i], probs->comp_ref[i]);
+}
+
+/* Counterpart to 6.3.14 read_y_mode_probs() */
+static void update_y_mode_probs(struct v4l2_vp9_frame_context *probs,
+				const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i, j;
+
+	for (i = 0; i < ARRAY_SIZE(probs->y_mode); i++)
+		for (j = 0; j < ARRAY_SIZE(probs->y_mode[0]); ++j)
+			probs->y_mode[i][j] =
+				update_prob(deltas->y_mode[i][j], probs->y_mode[i][j]);
+}
+
+/* Counterpart to 6.3.15 read_partition_probs() */
+static void update_partition_probs(struct v4l2_vp9_frame_context *probs,
+				   const struct v4l2_ctrl_vp9_compressed_hdr *deltas)
+{
+	int i, j;
+
+	for (i = 0; i < 4; i++)
+		for (j = 0; j < 4; j++) {
+			u8 *p = probs->partition[i * 4 + j];
+			const u8 *d = deltas->partition[i * 4 + j];
+
+			p[0] = update_prob(d[0], p[0]);
+			p[1] = update_prob(d[1], p[1]);
+			p[2] = update_prob(d[2], p[2]);
+		}
+}
+
+static inline int update_mv_prob(int delta, int prob)
+{
+	if (!delta)
+		return prob;
+
+	return delta;
+}
+
+/* Counterpart to 6.3.16 mv_probs() */
+static void update_mv_probs(struct v4l2_vp9_frame_context *probs,
+			    const struct v4l2_ctrl_vp9_compressed_hdr *deltas,
+			    const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	u8 *p = probs->mv.joint;
+	const u8 *d = deltas->mv.joint;
+	unsigned int i, j;
+
+	p[0] = update_mv_prob(d[0], p[0]);
+	p[1] = update_mv_prob(d[1], p[1]);
+	p[2] = update_mv_prob(d[2], p[2]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->mv.sign); i++) {
+		p = probs->mv.sign;
+		d = deltas->mv.sign;
+		p[i] = update_mv_prob(d[i], p[i]);
+
+		p = probs->mv.classes[i];
+		d = deltas->mv.classes[i];
+		for (j = 0; j < ARRAY_SIZE(probs->mv.classes[0]); j++)
+			p[j] = update_mv_prob(d[j], p[j]);
+
+		p = probs->mv.class0_bit;
+		d = deltas->mv.class0_bit;
+		p[i] = update_mv_prob(d[i], p[i]);
+
+		p = probs->mv.bits[i];
+		d = deltas->mv.bits[i];
+		for (j = 0; j < ARRAY_SIZE(probs->mv.bits[0]); j++)
+			p[j] = update_mv_prob(d[j], p[j]);
+
+		for (j = 0; j < ARRAY_SIZE(probs->mv.class0_fr[0]); j++) {
+			p = probs->mv.class0_fr[i][j];
+			d = deltas->mv.class0_fr[i][j];
+
+			p[0] = update_mv_prob(d[0], p[0]);
+			p[1] = update_mv_prob(d[1], p[1]);
+			p[2] = update_mv_prob(d[2], p[2]);
+		}
+
+		p = probs->mv.fr[i];
+		d = deltas->mv.fr[i];
+		for (j = 0; j < ARRAY_SIZE(probs->mv.fr[i]); j++)
+			p[j] = update_mv_prob(d[j], p[j]);
+
+		if (dec_params->flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV) {
+			p = probs->mv.class0_hp;
+			d = deltas->mv.class0_hp;
+			p[i] = update_mv_prob(d[i], p[i]);
+
+			p = probs->mv.hp;
+			d = deltas->mv.hp;
+			p[i] = update_mv_prob(d[i], p[i]);
+		}
+	}
+}
+
+/* Counterpart to 6.3 compressed_header(), but parsing has been done in userspace. */
+void v4l2_vp9_fw_update_probs(struct v4l2_vp9_frame_context *probs,
+			      const struct v4l2_ctrl_vp9_compressed_hdr *deltas,
+			      const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	if (deltas->tx_mode == V4L2_VP9_TX_MODE_SELECT)
+		update_tx_probs(probs, deltas);
+
+	update_coef_probs(probs, deltas, dec_params);
+
+	update_skip_probs(probs, deltas);
+
+	if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME ||
+	    dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY)
+		return;
+
+	update_inter_mode_probs(probs, deltas);
+
+	if (dec_params->interpolation_filter == V4L2_VP9_INTERP_FILTER_SWITCHABLE)
+		update_interp_filter_probs(probs, deltas);
+
+	update_is_inter_probs(probs, deltas);
+
+	update_frame_reference_mode_probs(dec_params->reference_mode, probs, deltas);
+
+	update_y_mode_probs(probs, deltas);
+
+	update_partition_probs(probs, deltas);
+
+	update_mv_probs(probs, deltas, dec_params);
+}
+EXPORT_SYMBOL_GPL(v4l2_vp9_fw_update_probs);
+
+u8 v4l2_vp9_reset_frame_ctx(const struct v4l2_ctrl_vp9_frame *dec_params,
+			    struct v4l2_vp9_frame_context *frame_context)
+{
+	int i;
+
+	u8 fctx_idx = dec_params->frame_context_idx;
+
+	if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME ||
+	    dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY ||
+	    dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) {
+		/*
+		 * setup_past_independence()
+		 * We do nothing here. Instead of storing default probs in some intermediate
+		 * location and then copying from that location to appropriate contexts
+		 * in save_probs() below, we skip that step and save default probs directly
+		 * to appropriate contexts.
+		 */
+		if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME ||
+		    dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT ||
+		    dec_params->reset_frame_context == V4L2_VP9_RESET_FRAME_CTX_ALL)
+			for (i = 0; i < 4; ++i)
+				/* save_probs(i) */
+				memcpy(&frame_context[i], &v4l2_vp9_default_probs,
+				       sizeof(v4l2_vp9_default_probs));
+		else if (dec_params->reset_frame_context == V4L2_VP9_RESET_FRAME_CTX_SPEC)
+			/* save_probs(fctx_idx) */
+			memcpy(&frame_context[fctx_idx], &v4l2_vp9_default_probs,
+			       sizeof(v4l2_vp9_default_probs));
+		fctx_idx = 0;
+	}
+
+	return fctx_idx;
+}
+EXPORT_SYMBOL_GPL(v4l2_vp9_reset_frame_ctx);
+
+/* 8.4.1 Merge prob process */
+static u8 merge_prob(u8 pre_prob, u32 ct0, u32 ct1, u16 count_sat, u32 max_update_factor)
+{
+	u32 den, prob, count, factor;
+
+	den = ct0 + ct1;
+	if (!den) {
+		/*
+		 * prob = 128, count = 0, update_factor = 0
+		 * Round2's argument: pre_prob * 256
+		 * (pre_prob * 256 + 128) >> 8 == pre_prob
+		 */
+		return pre_prob;
+	}
+
+	prob = clamp(((ct0 << 8) + (den >> 1)) / den, (u32)1, (u32)255);
+	count = min_t(u32, den, count_sat);
+	factor = fastdiv(max_update_factor * count, count_sat);
+
+	/*
+	 * Round2(pre_prob * (256 - factor) + prob * factor, 8)
+	 * Round2(pre_prob * 256 + (prob - pre_prob) * factor, 8)
+	 * (pre_prob * 256 >> 8) + (((prob - pre_prob) * factor + 128) >> 8)
+	 */
+	return pre_prob + (((prob - pre_prob) * factor + 128) >> 8);
+}
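
The rearranged return expression in merge_prob() relies on unsigned
wraparound and on the final truncation to u8 when prob < pre_prob. The
following user-space sketch, not part of the patch, compares it against the
spec's Round2() form. It assumes plain '/' in place of fastdiv() and stdint
types in place of the kernel ones; merge_terms(), merge_prob_spec(),
merge_prob_rearranged() and merge_prob_mismatches() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Shared front half: estimate prob from the counts and derive the blend factor. */
static void merge_terms(uint32_t ct0, uint32_t ct1, uint32_t count_sat,
			uint32_t max_update_factor,
			uint32_t *prob, uint32_t *factor)
{
	uint32_t den = ct0 + ct1;
	uint32_t count = den < count_sat ? den : count_sat;

	assert(den != 0 && count_sat != 0 && max_update_factor <= 128);

	*prob = ((ct0 << 8) + (den >> 1)) / den;
	if (*prob < 1)
		*prob = 1;
	if (*prob > 255)
		*prob = 255;
	*factor = max_update_factor * count / count_sat;	/* plain '/', not fastdiv() */
}

/* Spec form: Round2(pre_prob * (256 - factor) + prob * factor, 8). */
static uint8_t merge_prob_spec(uint8_t pre_prob, uint32_t ct0, uint32_t ct1,
			       uint32_t count_sat, uint32_t max_update_factor)
{
	uint32_t prob, factor;

	if (!(ct0 + ct1))
		return pre_prob;

	merge_terms(ct0, ct1, count_sat, max_update_factor, &prob, &factor);
	return (uint8_t)((pre_prob * (256 - factor) + prob * factor + 128) >> 8);
}

/* Rearranged form used in the patch; wraps in 32 bits, then truncates to 8. */
static uint8_t merge_prob_rearranged(uint8_t pre_prob, uint32_t ct0, uint32_t ct1,
				     uint32_t count_sat, uint32_t max_update_factor)
{
	uint32_t prob, factor;

	if (!(ct0 + ct1))
		return pre_prob;

	merge_terms(ct0, ct1, count_sat, max_update_factor, &prob, &factor);
	return (uint8_t)(pre_prob + (((prob - pre_prob) * factor + 128) >> 8));
}

/* Count inputs on which the two forms disagree, over a small grid of counts. */
static unsigned int merge_prob_mismatches(uint32_t count_sat, uint32_t max_uf)
{
	unsigned int bad = 0;
	uint32_t pre, ct0, ct1;

	for (pre = 1; pre <= 255; pre++)
		for (ct0 = 0; ct0 <= 30; ct0++)
			for (ct1 = 0; ct1 <= 30; ct1++)
				if (merge_prob_spec((uint8_t)pre, ct0, ct1, count_sat, max_uf) !=
				    merge_prob_rearranged((uint8_t)pre, ct0, ct1, count_sat, max_uf))
					bad++;

	return bad;
}
```

As a worked case with count_sat = 20 and max_update_factor = 128: counts
(30, 10) move pre_prob = 128 up to 160, and counts (1, 3) move pre_prob = 200
down to 187. The second case exercises the wraparound path.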
+
+static inline u8 noncoef_merge_prob(u8 pre_prob, u32 ct0, u32 ct1)
+{
+	return merge_prob(pre_prob, ct0, ct1, 20, 128);
+}
+
+/* 8.4.2 Merge probs process */
+/*
+ * merge_probs() is a recursive function in the spec. We avoid recursion in the kernel.
+ * That said, the "tree" parameter of merge_probs() controls how deep the recursion goes.
+ * It turns out that in all cases the recursive calls boil down to a short-ish series
+ * of merge_prob() invocations (note no "s").
+ *
+ * Variant A
+ * ---------
+ * merge_probs(small_token_tree, 2):
+ *	merge_prob(p[1], c[0], c[1] + c[2])
+ *	merge_prob(p[2], c[1], c[2])
+ *
+ * Variant B
+ * ---------
+ * merge_probs(binary_tree, 0) or
+ * merge_probs(tx_size_8_tree, 0):
+ *	merge_prob(p[0], c[0], c[1])
+ *
+ * Variant C
+ * ---------
+ * merge_probs(inter_mode_tree, 0):
+ *	merge_prob(p[0], c[2], c[1] + c[0] + c[3])
+ *	merge_prob(p[1], c[0], c[1] + c[3])
+ *	merge_prob(p[2], c[1], c[3])
+ *
+ * Variant D
+ * ---------
+ * merge_probs(intra_mode_tree, 0):
+ *	merge_prob(p[0], c[0], c[1] + ... + c[9])
+ *	merge_prob(p[1], c[9], c[1] + ... + c[8])
+ *	merge_prob(p[2], c[1], c[2] + ... + c[8])
+ *	merge_prob(p[3], c[2] + c[4] + c[5], c[3] + c[8] + c[6] + c[7])
+ *	merge_prob(p[4], c[2], c[4] + c[5])
+ *	merge_prob(p[5], c[4], c[5])
+ *	merge_prob(p[6], c[3], c[8] + c[6] + c[7])
+ *	merge_prob(p[7], c[8], c[6] + c[7])
+ *	merge_prob(p[8], c[6], c[7])
+ *
+ * Variant E
+ * ---------
+ * merge_probs(partition_tree, 0) or
+ * merge_probs(tx_size_32_tree, 0) or
+ * merge_probs(mv_joint_tree, 0) or
+ * merge_probs(mv_fr_tree, 0):
+ *	merge_prob(p[0], c[0], c[1] + c[2] + c[3])
+ *	merge_prob(p[1], c[1], c[2] + c[3])
+ *	merge_prob(p[2], c[2], c[3])
+ *
+ * Variant F
+ * ---------
+ * merge_probs(interp_filter_tree, 0) or
+ * merge_probs(tx_size_16_tree, 0):
+ *	merge_prob(p[0], c[0], c[1] + c[2])
+ *	merge_prob(p[1], c[1], c[2])
+ *
+ * Variant G
+ * ---------
+ * merge_probs(mv_class_tree, 0):
+ *	merge_prob(p[0], c[0], c[1] + ... + c[10])
+ *	merge_prob(p[1], c[1], c[2] + ... + c[10])
+ *	merge_prob(p[2], c[2] + c[3], c[4] + ... + c[10])
+ *	merge_prob(p[3], c[2], c[3])
+ *	merge_prob(p[4], c[4] + c[5], c[6] + ... + c[10])
+ *	merge_prob(p[5], c[4], c[5])
+ *	merge_prob(p[6], c[6], c[7] + ... + c[10])
+ *	merge_prob(p[7], c[7] + c[8], c[9] + c[10])
+ *	merge_prob(p[8], c[7], c[8])
+ *	merge_prob(p[9], c[9], c[10])
+ */
+
+static inline void merge_probs_variant_a(u8 *p, const u32 *c, u16 count_sat, u32 update_factor)
+{
+	p[1] = merge_prob(p[1], c[0], c[1] + c[2], count_sat, update_factor);
+	p[2] = merge_prob(p[2], c[1], c[2], count_sat, update_factor);
+}
+
+static inline void merge_probs_variant_b(u8 *p, const u32 *c, u16 count_sat, u32 update_factor)
+{
+	p[0] = merge_prob(p[0], c[0], c[1], count_sat, update_factor);
+}
+
+static inline void merge_probs_variant_c(u8 *p, const u32 *c)
+{
+	p[0] = noncoef_merge_prob(p[0], c[2], c[1] + c[0] + c[3]);
+	p[1] = noncoef_merge_prob(p[1], c[0], c[1] + c[3]);
+	p[2] = noncoef_merge_prob(p[2], c[1], c[3]);
+}
+
+static void merge_probs_variant_d(u8 *p, const u32 *c)
+{
+	u32 sum, s2;
+
+	sum = c[1] + c[2] + c[3] + c[4] + c[5] + c[6] + c[7] + c[8] + c[9];
+
+	p[0] = noncoef_merge_prob(p[0], c[0], sum);
+	sum -= c[9];
+	p[1] = noncoef_merge_prob(p[1], c[9], sum);
+	sum -= c[1];
+	p[2] = noncoef_merge_prob(p[2], c[1], sum);
+	s2 = c[2] + c[4] + c[5];
+	sum -= s2;
+	p[3] = noncoef_merge_prob(p[3], s2, sum);
+	s2 -= c[2];
+	p[4] = noncoef_merge_prob(p[4], c[2], s2);
+	p[5] = noncoef_merge_prob(p[5], c[4], c[5]);
+	sum -= c[3];
+	p[6] = noncoef_merge_prob(p[6], c[3], sum);
+	sum -= c[8];
+	p[7] = noncoef_merge_prob(p[7], c[8], sum);
+	p[8] = noncoef_merge_prob(p[8], c[6], c[7]);
+}
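
merge_probs_variant_d() derives each branch's count pair from a running sum
rather than re-adding the terms listed under "Variant D" in the comment. A
user-space sketch, not part of the patch, that builds both sets of pairs and
compares them. struct branch_counts and the variant_d_*() helpers are
hypothetical names, and the counts in variant_d_selfcheck() are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

struct branch_counts {
	uint32_t ct0, ct1;
};

/* Explicit sums, copied from the "Variant D" list in the comment. */
static void variant_d_explicit(const uint32_t *c, struct branch_counts *b)
{
	b[0] = (struct branch_counts){ c[0], c[1] + c[2] + c[3] + c[4] + c[5] +
					     c[6] + c[7] + c[8] + c[9] };
	b[1] = (struct branch_counts){ c[9], c[1] + c[2] + c[3] + c[4] + c[5] +
					     c[6] + c[7] + c[8] };
	b[2] = (struct branch_counts){ c[1], c[2] + c[3] + c[4] + c[5] +
					     c[6] + c[7] + c[8] };
	b[3] = (struct branch_counts){ c[2] + c[4] + c[5], c[3] + c[8] + c[6] + c[7] };
	b[4] = (struct branch_counts){ c[2], c[4] + c[5] };
	b[5] = (struct branch_counts){ c[4], c[5] };
	b[6] = (struct branch_counts){ c[3], c[8] + c[6] + c[7] };
	b[7] = (struct branch_counts){ c[8], c[6] + c[7] };
	b[8] = (struct branch_counts){ c[6], c[7] };
}

/* Running-sum form, mirroring the statement order of merge_probs_variant_d(). */
static void variant_d_running(const uint32_t *c, struct branch_counts *b)
{
	uint32_t sum, s2;

	sum = c[1] + c[2] + c[3] + c[4] + c[5] + c[6] + c[7] + c[8] + c[9];
	b[0] = (struct branch_counts){ c[0], sum };
	sum -= c[9];
	b[1] = (struct branch_counts){ c[9], sum };
	sum -= c[1];
	b[2] = (struct branch_counts){ c[1], sum };
	s2 = c[2] + c[4] + c[5];
	sum -= s2;
	b[3] = (struct branch_counts){ s2, sum };
	s2 -= c[2];
	b[4] = (struct branch_counts){ c[2], s2 };
	b[5] = (struct branch_counts){ c[4], c[5] };
	sum -= c[3];
	b[6] = (struct branch_counts){ c[3], sum };
	sum -= c[8];
	b[7] = (struct branch_counts){ c[8], sum };
	b[8] = (struct branch_counts){ c[6], c[7] };
}

/* Returns 1 if both forms produce identical count pairs for all nine branches. */
static int variant_d_forms_agree(const uint32_t *c)
{
	struct branch_counts e[9], r[9];
	int i;

	assert(c);
	variant_d_explicit(c, e);
	variant_d_running(c, r);

	for (i = 0; i < 9; i++)
		if (e[i].ct0 != r[i].ct0 || e[i].ct1 != r[i].ct1)
			return 0;

	return 1;
}

/* Arbitrary sample counts for the ten intra modes. */
static int variant_d_selfcheck(void)
{
	static const uint32_t c[10] = { 5, 1, 2, 3, 4, 6, 7, 8, 9, 10 };

	return variant_d_forms_agree(c);
}
```

The running-sum form needs one subtraction per tree level instead of a fresh
multi-term addition, which is presumably why the patch uses it.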
+
+static inline void merge_probs_variant_e(u8 *p, const u32 *c)
+{
+	p[0] = noncoef_merge_prob(p[0], c[0], c[1] + c[2] + c[3]);
+	p[1] = noncoef_merge_prob(p[1], c[1], c[2] + c[3]);
+	p[2] = noncoef_merge_prob(p[2], c[2], c[3]);
+}
+
+static inline void merge_probs_variant_f(u8 *p, const u32 *c)
+{
+	p[0] = noncoef_merge_prob(p[0], c[0], c[1] + c[2]);
+	p[1] = noncoef_merge_prob(p[1], c[1], c[2]);
+}
+
+static void merge_probs_variant_g(u8 *p, const u32 *c)
+{
+	u32 sum;
+
+	sum = c[1] + c[2] + c[3] + c[4] + c[5] + c[6] + c[7] + c[8] + c[9] + c[10];
+	p[0] = noncoef_merge_prob(p[0], c[0], sum);
+	sum -= c[1];
+	p[1] = noncoef_merge_prob(p[1], c[1], sum);
+	sum -= c[2] + c[3];
+	p[2] = noncoef_merge_prob(p[2], c[2] + c[3], sum);
+	p[3] = noncoef_merge_prob(p[3], c[2], c[3]);
+	sum -= c[4] + c[5];
+	p[4] = noncoef_merge_prob(p[4], c[4] + c[5], sum);
+	p[5] = noncoef_merge_prob(p[5], c[4], c[5]);
+	sum -= c[6];
+	p[6] = noncoef_merge_prob(p[6], c[6], sum);
+	p[7] = noncoef_merge_prob(p[7], c[7] + c[8], c[9] + c[10]);
+	p[8] = noncoef_merge_prob(p[8], c[7], c[8]);
+	p[9] = noncoef_merge_prob(p[9], c[9], c[10]);
+}
+
+/* 8.4.3 Coefficient probability adaptation process */
+static inline void adapt_probs_variant_a_coef(u8 *p, const u32 *c, u32 update_factor)
+{
+	merge_probs_variant_a(p, c, 24, update_factor);
+}
+
+static inline void adapt_probs_variant_b_coef(u8 *p, const u32 *c, u32 update_factor)
+{
+	merge_probs_variant_b(p, c, 24, update_factor);
+}
+
+static void _adapt_coeff(unsigned int i, unsigned int j, unsigned int k,
+			 struct v4l2_vp9_frame_context *probs,
+			 const struct v4l2_vp9_frame_symbol_counts *counts,
+			 u32 uf)
+{
+	s32 l, m;
+
+	for (l = 0; l < ARRAY_SIZE(probs->coef[0][0][0]); l++) {
+		for (m = 0; m < BAND_6(l); m++) {
+			u8 *p = probs->coef[i][j][k][l][m];
+			const u32 counts_more_coefs[2] = {
+				*counts->eob[i][j][k][l][m][1],
+				*counts->eob[i][j][k][l][m][0] - *counts->eob[i][j][k][l][m][1],
+			};
+
+			adapt_probs_variant_a_coef(p, *counts->coeff[i][j][k][l][m], uf);
+			adapt_probs_variant_b_coef(p, counts_more_coefs, uf);
+		}
+	}
+}
+
+static void _adapt_coef_probs(struct v4l2_vp9_frame_context *probs,
+			      const struct v4l2_vp9_frame_symbol_counts *counts,
+			      unsigned int uf)
+{
+	unsigned int i, j, k;
+
+	for (i = 0; i < ARRAY_SIZE(probs->coef); i++)
+		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++)
+			for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++)
+				_adapt_coeff(i, j, k, probs, counts, uf);
+}
+
+void v4l2_vp9_adapt_coef_probs(struct v4l2_vp9_frame_context *probs,
+			       struct v4l2_vp9_frame_symbol_counts *counts,
+			       bool use_128,
+			       bool frame_is_intra)
+{
+	if (frame_is_intra) {
+		_adapt_coef_probs(probs, counts, 112);
+	} else {
+		if (use_128)
+			_adapt_coef_probs(probs, counts, 128);
+		else
+			_adapt_coef_probs(probs, counts, 112);
+	}
+}
+EXPORT_SYMBOL_GPL(v4l2_vp9_adapt_coef_probs);
+
+/* 8.4.4 Non coefficient probability adaptation process, adapt_probs() */
+static inline void adapt_probs_variant_b(u8 *p, const u32 *c)
+{
+	merge_probs_variant_b(p, c, 20, 128);
+}
+
+static inline void adapt_probs_variant_c(u8 *p, const u32 *c)
+{
+	merge_probs_variant_c(p, c);
+}
+
+static inline void adapt_probs_variant_d(u8 *p, const u32 *c)
+{
+	merge_probs_variant_d(p, c);
+}
+
+static inline void adapt_probs_variant_e(u8 *p, const u32 *c)
+{
+	merge_probs_variant_e(p, c);
+}
+
+static inline void adapt_probs_variant_f(u8 *p, const u32 *c)
+{
+	merge_probs_variant_f(p, c);
+}
+
+static inline void adapt_probs_variant_g(u8 *p, const u32 *c)
+{
+	merge_probs_variant_g(p, c);
+}
+
+/* 8.4.4 Non coefficient probability adaptation process, adapt_prob() */
+static inline u8 adapt_prob(u8 prob, const u32 counts[2])
+{
+	return noncoef_merge_prob(prob, counts[0], counts[1]);
+}
+
+/* 8.4.4 Non coefficient probability adaptation process */
+void v4l2_vp9_adapt_noncoef_probs(struct v4l2_vp9_frame_context *probs,
+				  struct v4l2_vp9_frame_symbol_counts *counts,
+				  u8 reference_mode, u8 interpolation_filter, u8 tx_mode,
+				  u32 flags)
+{
+	unsigned int i, j;
+
+	for (i = 0; i < ARRAY_SIZE(probs->is_inter); i++)
+		probs->is_inter[i] = adapt_prob(probs->is_inter[i], (*counts->intra_inter)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->comp_mode); i++)
+		probs->comp_mode[i] = adapt_prob(probs->comp_mode[i], (*counts->comp)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->comp_ref); i++)
+		probs->comp_ref[i] = adapt_prob(probs->comp_ref[i], (*counts->comp_ref)[i]);
+
+	if (reference_mode != V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE)
+		for (i = 0; i < ARRAY_SIZE(probs->single_ref); i++)
+			for (j = 0; j < ARRAY_SIZE(probs->single_ref[0]); j++)
+				probs->single_ref[i][j] = adapt_prob(probs->single_ref[i][j],
+								     (*counts->single_ref)[i][j]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->inter_mode); i++)
+		adapt_probs_variant_c(probs->inter_mode[i], (*counts->mv_mode)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->y_mode); i++)
+		adapt_probs_variant_d(probs->y_mode[i], (*counts->y_mode)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->uv_mode); i++)
+		adapt_probs_variant_d(probs->uv_mode[i], (*counts->uv_mode)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->partition); i++)
+		adapt_probs_variant_e(probs->partition[i], (*counts->partition)[i]);
+
+	for (i = 0; i < ARRAY_SIZE(probs->skip); i++)
+		probs->skip[i] = adapt_prob(probs->skip[i], (*counts->skip)[i]);
+
+	if (interpolation_filter == V4L2_VP9_INTERP_FILTER_SWITCHABLE)
+		for (i = 0; i < ARRAY_SIZE(probs->interp_filter); i++)
+			adapt_probs_variant_f(probs->interp_filter[i], (*counts->filter)[i]);
+
+	if (tx_mode == V4L2_VP9_TX_MODE_SELECT)
+		for (i = 0; i < ARRAY_SIZE(probs->tx8); i++) {
+			adapt_probs_variant_b(probs->tx8[i], (*counts->tx8p)[i]);
+			adapt_probs_variant_f(probs->tx16[i], (*counts->tx16p)[i]);
+			adapt_probs_variant_e(probs->tx32[i], (*counts->tx32p)[i]);
+		}
+
+	adapt_probs_variant_e(probs->mv.joint, *counts->mv_joint);
+
+	for (i = 0; i < ARRAY_SIZE(probs->mv.sign); i++) {
+		probs->mv.sign[i] = adapt_prob(probs->mv.sign[i], (*counts->sign)[i]);
+
+		adapt_probs_variant_g(probs->mv.classes[i], (*counts->classes)[i]);
+
+		probs->mv.class0_bit[i] = adapt_prob(probs->mv.class0_bit[i], (*counts->class0)[i]);
+
+		for (j = 0; j < ARRAY_SIZE(probs->mv.bits[0]); j++)
+			probs->mv.bits[i][j] = adapt_prob(probs->mv.bits[i][j],
+							  (*counts->bits)[i][j]);
+
+		for (j = 0; j < ARRAY_SIZE(probs->mv.class0_fr[0]); j++)
+			adapt_probs_variant_e(probs->mv.class0_fr[i][j],
+					      (*counts->class0_fp)[i][j]);
+
+		adapt_probs_variant_e(probs->mv.fr[i], (*counts->fp)[i]);
+
+		if (!(flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV))
+			continue;
+
+		probs->mv.class0_hp[i] = adapt_prob(probs->mv.class0_hp[i],
+						    (*counts->class0_hp)[i]);
+
+		probs->mv.hp[i] = adapt_prob(probs->mv.hp[i], (*counts->hp)[i]);
+	}
+}
+EXPORT_SYMBOL_GPL(v4l2_vp9_adapt_noncoef_probs);
+
+bool
+v4l2_vp9_seg_feat_enabled(const u8 *feature_enabled,
+			  unsigned int feature,
+			  unsigned int segid)
+{
+	u8 mask = V4L2_VP9_SEGMENT_FEATURE_ENABLED(feature);
+
+	return !!(feature_enabled[segid] & mask);
+}
+EXPORT_SYMBOL_GPL(v4l2_vp9_seg_feat_enabled);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("V4L2 VP9 Helpers");
+MODULE_AUTHOR("Andrzej Pietrasiewicz <andrzej.p@collabora.com>");
diff --git a/include/media/v4l2-vp9.h b/include/media/v4l2-vp9.h
new file mode 100644
index 000000000000..3415608dbc7c
--- /dev/null
+++ b/include/media/v4l2-vp9.h
@@ -0,0 +1,182 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Helper functions for vp9 codecs.
+ *
+ * Copyright (c) 2021 Collabora, Ltd.
+ *
+ * Author: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
+ */
+
+#ifndef _MEDIA_V4L2_VP9_H
+#define _MEDIA_V4L2_VP9_H
+
+#include <media/v4l2-ctrls.h>
+
+/**
+ * struct v4l2_vp9_frame_mv_context - motion vector-related probabilities
+ *
+ * A member of v4l2_vp9_frame_context.
+ */
+struct v4l2_vp9_frame_mv_context {
+	u8 joint[3];
+	u8 sign[2];
+	u8 classes[2][10];
+	u8 class0_bit[2];
+	u8 bits[2][10];
+	u8 class0_fr[2][2][3];
+	u8 fr[2][3];
+	u8 class0_hp[2];
+	u8 hp[2];
+};
+
+/**
+ * struct v4l2_vp9_frame_context - frame probabilities, including motion-vector related
+ *
+ * Drivers which need to keep track of frame context(s) can use this struct.
+ * The members correspond to probability tables, which are specified only implicitly in the
+ * vp9 spec. Section 10.5 "Default probability tables" contains all the types of involved
+ * tables, i.e. the actual tables are of the same kind, and when they are reset (which is
+ * mandated by the spec sometimes) they are overwritten with values from the default tables.
+ */
+struct v4l2_vp9_frame_context {
+	u8 tx8[2][1];
+	u8 tx16[2][2];
+	u8 tx32[2][3];
+	u8 coef[4][2][2][6][6][3];
+	u8 skip[3];
+	u8 inter_mode[7][3];
+	u8 interp_filter[4][2];
+	u8 is_inter[4];
+	u8 comp_mode[5];
+	u8 single_ref[5][2];
+	u8 comp_ref[5];
+	u8 y_mode[4][9];
+	u8 uv_mode[10][9];
+	u8 partition[16][3];
+
+	struct v4l2_vp9_frame_mv_context mv;
+};
+
+/**
+ * struct v4l2_vp9_frame_symbol_counts - pointers to arrays of symbol counts
+ *
+ * The fields correspond to what is specified in section 8.3 "Clear counts process" of the spec.
+ * Different pieces of hardware can report the counts in different order, so we cannot rely on
+ * simply overlaying a struct on a relevant block of memory. Instead we provide pointers to
+ * arrays, an array of pointers to arrays in the case of coeff, and an array of pointers for eob.
+ */
+struct v4l2_vp9_frame_symbol_counts {
+	u32 (*partition)[16][4];
+	u32 (*skip)[3][2];
+	u32 (*intra_inter)[4][2];
+	u32 (*tx32p)[2][4];
+	u32 (*tx16p)[2][4];
+	u32 (*tx8p)[2][2];
+	u32 (*y_mode)[4][10];
+	u32 (*uv_mode)[10][10];
+	u32 (*comp)[5][2];
+	u32 (*comp_ref)[5][2];
+	u32 (*single_ref)[5][2][2];
+	u32 (*mv_mode)[7][4];
+	u32 (*filter)[4][3];
+	u32 (*mv_joint)[4];
+	u32 (*sign)[2][2];
+	u32 (*classes)[2][11];
+	u32 (*class0)[2][2];
+	u32 (*bits)[2][10][2];
+	u32 (*class0_fp)[2][2][4];
+	u32 (*fp)[2][4];
+	u32 (*class0_hp)[2][2];
+	u32 (*hp)[2][2];
+	u32 (*coeff[4][2][2][6][6])[3];
+	u32 *eob[4][2][2][6][6][2];
+};
+
+extern const u8 v4l2_vp9_kf_y_mode_prob[10][10][9]; /* Section 10.4 of the spec */
+extern const u8 v4l2_vp9_kf_partition_probs[16][3]; /* Section 10.4 of the spec */
+extern const u8 v4l2_vp9_kf_uv_mode_prob[10][9]; /* Section 10.4 of the spec */
+extern const struct v4l2_vp9_frame_context v4l2_vp9_default_probs; /* Section 10.5 of the spec */
+
+/**
+ * v4l2_vp9_fw_update_probs() - Perform forward update of vp9 probabilities
+ *
+ * @probs: current probabilities values
+ * @deltas: delta values from compressed header
+ * @dec_params: vp9 frame decoding parameters
+ *
+ * This function performs forward updates of probabilities for the vp9 boolean decoder.
+ * The frame header can contain a directive to update the probabilities, in which case the
+ * deltas are provided in the header, too. Userspace parses them and passes the resulting
+ * deltas struct to the kernel.
+ */
+void v4l2_vp9_fw_update_probs(struct v4l2_vp9_frame_context *probs,
+			      const struct v4l2_ctrl_vp9_compressed_hdr *deltas,
+			      const struct v4l2_ctrl_vp9_frame *dec_params);
+
+/**
+ * v4l2_vp9_reset_frame_ctx() - Reset appropriate frame context
+ *
+ * @dec_params: vp9 frame decoding parameters
+ * @frame_context: array of the 4 frame contexts
+ *
+ * This function resets appropriate frame contexts, based on what's in dec_params.
+ *
+ * Returns the frame context index after the update, which might be reset to zero if
+ * mandated by the spec.
+ */
+u8 v4l2_vp9_reset_frame_ctx(const struct v4l2_ctrl_vp9_frame *dec_params,
+			    struct v4l2_vp9_frame_context *frame_context);
+
+/**
+ * v4l2_vp9_adapt_coef_probs() - Perform backward update of vp9 coefficient probabilities
+ *
+ * @probs: current probabilities values
+ * @counts: values of symbol counts after the current frame has been decoded
+ * @use_128: flag to request that 128 is used as update factor if true, otherwise 112 is used
+ * @frame_is_intra: flag indicating that FrameIsIntra is true
+ *
+ * This function performs backward updates of coefficient probabilities for the vp9 boolean
+ * decoder. After a frame has been decoded the counts of how many times a given symbol has
+ * occurred are known and are used to update the probability of each symbol.
+ */
+void v4l2_vp9_adapt_coef_probs(struct v4l2_vp9_frame_context *probs,
+			       struct v4l2_vp9_frame_symbol_counts *counts,
+			       bool use_128,
+			       bool frame_is_intra);
+
+/**
+ * v4l2_vp9_adapt_noncoef_probs() - Perform backward update of vp9 non-coefficient probabilities
+ *
+ * @probs: current probabilities values
+ * @counts: values of symbol counts after the current frame has been decoded
+ * @reference_mode: specifies the type of inter prediction to be used. See
+ *	&v4l2_vp9_reference_mode for more details
+ * @interpolation_filter: specifies the filter selection used for performing inter prediction.
+ *	See &v4l2_vp9_interpolation_filter for more details
+ * @tx_mode: specifies the TX mode. See &v4l2_vp9_tx_mode for more details
+ * @flags: combination of V4L2_VP9_FRAME_FLAG_* flags
+ *
+ * This function performs backward updates of non-coefficient probabilities for the vp9 boolean
+ * decoder. After a frame has been decoded the counts of how many times a given symbol has
+ * occurred are known and are used to update the probability of each symbol.
+ */
+void v4l2_vp9_adapt_noncoef_probs(struct v4l2_vp9_frame_context *probs,
+				  struct v4l2_vp9_frame_symbol_counts *counts,
+				  u8 reference_mode, u8 interpolation_filter, u8 tx_mode,
+				  u32 flags);
+
+/**
+ * v4l2_vp9_seg_feat_enabled() - Check if a segmentation feature is enabled
+ *
+ * @feature_enabled: array of 8-bit flags (for all segments)
+ * @feature: id of the feature to check
+ * @segid: id of the segment to look up
+ *
+ * This function returns true if a given feature is active in a given segment.
+ */
+bool
+v4l2_vp9_seg_feat_enabled(const u8 *feature_enabled,
+			  unsigned int feature,
+			  unsigned int segid);
+
+#endif /* _MEDIA_V4L2_VP9_H */
-- 
2.17.1
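
Note on the adaptation helpers in the patch above: the merge_probs_variant_*()
functions all reduce to repeated calls of merge_prob(), which is not visible in
this part of the diff. The following is a minimal userspace sketch of that step
as described in section 8.4.2 of the VP9 spec, not the kernel implementation.
The names sketch_merge_prob() and sketch_merge_probs_variant_f() are
hypothetical. The count saturation of 20 and update factor of 128 are the
values the patch passes for non-coefficient probabilities.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the merge step (VP9 spec 8.4.2).
 * Blends the previous probability with the one observed from the counts.
 */
static uint8_t sketch_merge_prob(uint8_t pre_prob, uint32_t ct0, uint32_t ct1,
				 uint32_t count_sat, uint32_t max_update_factor)
{
	uint32_t den = ct0 + ct1;
	uint32_t prob, count, factor;

	assert(count_sat != 0);

	/* No symbols observed: the update factor is 0, so nothing changes. */
	if (den == 0)
		return pre_prob;

	/* Observed probability of branch 0, rounded and clipped to 1..255. */
	prob = (uint32_t)(((uint64_t)ct0 * 256 + (den >> 1)) / den);
	if (prob < 1)
		prob = 1;
	if (prob > 255)
		prob = 255;

	/* The more symbols were seen, the more weight the observation gets. */
	count = den < count_sat ? den : count_sat;
	factor = max_update_factor * count / count_sat;

	/* Round2(pre_prob * (256 - factor) + prob * factor, 8) */
	return (uint8_t)((pre_prob * (256 - factor) + prob * factor + 128) >> 8);
}

/* Variant F from the comment above: a three-leaf tree with two probabilities. */
static void sketch_merge_probs_variant_f(uint8_t p[2], const uint32_t c[3])
{
	p[0] = sketch_merge_prob(p[0], c[0], c[1] + c[2], 20, 128);
	p[1] = sketch_merge_prob(p[1], c[1], c[2], 20, 128);
}
```

With all counts at zero the probabilities are returned unchanged, which is why
a context that was never used during a frame keeps its previous values.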


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v7 07/11] media: rkvdec: Add the VP9 backend
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (5 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 06/11] media: Add VP9 v4l2 library Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-10-08 10:30   ` Chen-Yu Tsai
  2021-10-19 23:24   ` Alex Bee
  2021-09-29 16:04 ` [PATCH v7 08/11] media: hantro: Rename registers Andrzej Pietrasiewicz
                   ` (6 subsequent siblings)
  13 siblings, 2 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia,
	Adrian Ratiu

From: Boris Brezillon <boris.brezillon@collabora.com>

The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add
a backend for this new format.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Co-developed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
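Note on buffer layout: get_mv_base_addr() in this patch places the motion
vector data right after the decoded picture in the capture buffer. The sketch
below restates that arithmetic in plain C so the alignment rules are easier to
check by hand. It assumes a 4:2:0 layout (hence the factor 3/2, as in the
patch), and the helper names sketch_round_up() and sketch_mv_offset() are
hypothetical, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a (hypothetical helper). */
static size_t sketch_round_up(size_t x, size_t a)
{
	assert(a != 0);
	return (x + a - 1) / a * a;
}

/*
 * Offset in bytes from the start of the capture buffer to the motion
 * vector area, following the arithmetic of get_mv_base_addr():
 * height aligned to 64 lines, pitch aligned to 512 bits.
 */
static size_t sketch_mv_offset(unsigned int width, unsigned int height,
			       unsigned int bit_depth)
{
	size_t aligned_height = sketch_round_up(height, 64);
	size_t aligned_pitch = sketch_round_up((size_t)width * bit_depth, 512) / 8;

	/* Luma plane plus half-size chroma plane. */
	return aligned_height * aligned_pitch * 3 / 2;
}
```

For a 1920x1080 8-bit frame this gives a pitch of 1920 bytes, an aligned height
of 1088 lines and an offset of 3133440 bytes.
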
 drivers/staging/media/rkvdec/Kconfig      |    1 +
 drivers/staging/media/rkvdec/Makefile     |    2 +-
 drivers/staging/media/rkvdec/rkvdec-vp9.c | 1078 +++++++++++++++++++++
 drivers/staging/media/rkvdec/rkvdec.c     |   52 +-
 drivers/staging/media/rkvdec/rkvdec.h     |   12 +-
 5 files changed, 1137 insertions(+), 8 deletions(-)
 create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
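
Note on the coefficient probability layout: write_coeff_plane() in this patch
copies the 6 x 6 x 3 = 108 probabilities of one plane into a 128-byte hardware
slot, skipping 5 bytes after every 27 bytes written. A rough sketch of the same
mapping with an explicit index formula follows. The function and macro names
are hypothetical, and unlike the driver (which clears the whole table in
init_probs()) the sketch zeroes the destination itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_COEF_BYTES	108	/* 6 * 6 * 3 probabilities */
#define SKETCH_GROUP_BYTES	27	/* payload bytes per group */
#define SKETCH_GROUP_STRIDE	32	/* payload plus 5 padding bytes */
#define SKETCH_PLANE_BYTES	128	/* 4 groups of 32 bytes */

/* Pack a flattened coefficient plane into the padded 128-byte layout. */
static void sketch_pack_coef_plane(const uint8_t *src, uint8_t *plane)
{
	size_t i;

	assert(src && plane);

	memset(plane, 0, SKETCH_PLANE_BYTES);
	for (i = 0; i < SKETCH_COEF_BYTES; i++) {
		size_t group = i / SKETCH_GROUP_BYTES;
		size_t offset = i % SKETCH_GROUP_BYTES;

		plane[group * SKETCH_GROUP_STRIDE + offset] = src[i];
	}
}
```

The last probability lands at byte 122, so the four groups fit in the 128 bytes
reserved per plane in the coef[][][128] and coef_intra[][][128] arrays.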

diff --git a/drivers/staging/media/rkvdec/Kconfig b/drivers/staging/media/rkvdec/Kconfig
index c02199b5e0fd..dc7292f346fa 100644
--- a/drivers/staging/media/rkvdec/Kconfig
+++ b/drivers/staging/media/rkvdec/Kconfig
@@ -9,6 +9,7 @@ config VIDEO_ROCKCHIP_VDEC
 	select VIDEOBUF2_VMALLOC
 	select V4L2_MEM2MEM_DEV
 	select V4L2_H264
+	select V4L2_VP9
 	help
 	  Support for the Rockchip Video Decoder IP present on Rockchip SoCs,
 	  which accelerates video decoding.
diff --git a/drivers/staging/media/rkvdec/Makefile b/drivers/staging/media/rkvdec/Makefile
index c08fed0a39f9..cb86b429cfaa 100644
--- a/drivers/staging/media/rkvdec/Makefile
+++ b/drivers/staging/media/rkvdec/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o
 
-rockchip-vdec-y += rkvdec.o rkvdec-h264.o
+rockchip-vdec-y += rkvdec.o rkvdec-h264.o rkvdec-vp9.o
diff --git a/drivers/staging/media/rkvdec/rkvdec-vp9.c b/drivers/staging/media/rkvdec/rkvdec-vp9.c
new file mode 100644
index 000000000000..ca463f18651a
--- /dev/null
+++ b/drivers/staging/media/rkvdec/rkvdec-vp9.c
@@ -0,0 +1,1078 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip Video Decoder VP9 backend
+ *
+ * Copyright (C) 2019 Collabora, Ltd.
+ *	Boris Brezillon <boris.brezillon@collabora.com>
+ * Copyright (C) 2021 Collabora, Ltd.
+ *	Andrzej Pietrasiewicz <andrzej.p@collabora.com>
+ *
+ * Copyright (C) 2016 Rockchip Electronics Co., Ltd.
+ *	Alpha Lin <Alpha.Lin@rock-chips.com>
+ */
+
+/*
+ * To follow the VP9 spec, start reading this driver code at
+ * rkvdec_vp9_run() and continue with rkvdec_vp9_done().
+ */
+
+#include <linux/kernel.h>
+#include <linux/vmalloc.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/v4l2-vp9.h>
+
+#include "rkvdec.h"
+#include "rkvdec-regs.h"
+
+#define RKVDEC_VP9_PROBE_SIZE		4864
+#define RKVDEC_VP9_COUNT_SIZE		13232
+#define RKVDEC_VP9_MAX_SEGMAP_SIZE	73728
+
+struct rkvdec_vp9_intra_mode_probs {
+	u8 y_mode[105];
+	u8 uv_mode[23];
+};
+
+struct rkvdec_vp9_intra_only_frame_probs {
+	u8 coef_intra[4][2][128];
+	struct rkvdec_vp9_intra_mode_probs intra_mode[10];
+};
+
+struct rkvdec_vp9_inter_frame_probs {
+	u8 y_mode[4][9];
+	u8 comp_mode[5];
+	u8 comp_ref[5];
+	u8 single_ref[5][2];
+	u8 inter_mode[7][3];
+	u8 interp_filter[4][2];
+	u8 padding0[11];
+	u8 coef[2][4][2][128];
+	u8 uv_mode_0_2[3][9];
+	u8 padding1[5];
+	u8 uv_mode_3_5[3][9];
+	u8 padding2[5];
+	u8 uv_mode_6_8[3][9];
+	u8 padding3[5];
+	u8 uv_mode_9[9];
+	u8 padding4[7];
+	u8 padding5[16];
+	struct {
+		u8 joint[3];
+		u8 sign[2];
+		u8 classes[2][10];
+		u8 class0_bit[2];
+		u8 bits[2][10];
+		u8 class0_fr[2][2][3];
+		u8 fr[2][3];
+		u8 class0_hp[2];
+		u8 hp[2];
+	} mv;
+};
+
+struct rkvdec_vp9_probs {
+	u8 partition[16][3];
+	u8 pred[3];
+	u8 tree[7];
+	u8 skip[3];
+	u8 tx32[2][3];
+	u8 tx16[2][2];
+	u8 tx8[2][1];
+	u8 is_inter[4];
+	/* 128 bit alignment */
+	u8 padding0[3];
+	union {
+		struct rkvdec_vp9_inter_frame_probs inter;
+		struct rkvdec_vp9_intra_only_frame_probs intra_only;
+	};
+};
+
+/* Data structure describing auxiliary buffer format. */
+struct rkvdec_vp9_priv_tbl {
+	struct rkvdec_vp9_probs probs;
+	u8 segmap[2][RKVDEC_VP9_MAX_SEGMAP_SIZE];
+};
+
+struct rkvdec_vp9_refs_counts {
+	u32 eob[2];
+	u32 coeff[3];
+};
+
+struct rkvdec_vp9_inter_frame_symbol_counts {
+	u32 partition[16][4];
+	u32 skip[3][2];
+	u32 inter[4][2];
+	u32 tx32p[2][4];
+	u32 tx16p[2][4];
+	u32 tx8p[2][2];
+	u32 y_mode[4][10];
+	u32 uv_mode[10][10];
+	u32 comp[5][2];
+	u32 comp_ref[5][2];
+	u32 single_ref[5][2][2];
+	u32 mv_mode[7][4];
+	u32 filter[4][3];
+	u32 mv_joint[4];
+	u32 sign[2][2];
+	/* one extra element for alignment */
+	u32 classes[2][11 + 1];
+	u32 class0[2][2];
+	u32 bits[2][10][2];
+	u32 class0_fp[2][2][4];
+	u32 fp[2][4];
+	u32 class0_hp[2][2];
+	u32 hp[2][2];
+	struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
+};
+
+struct rkvdec_vp9_intra_frame_symbol_counts {
+	u32 partition[4][4][4];
+	u32 skip[3][2];
+	u32 intra[4][2];
+	u32 tx32p[2][4];
+	u32 tx16p[2][4];
+	u32 tx8p[2][2];
+	struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
+};
+
+struct rkvdec_vp9_run {
+	struct rkvdec_run base;
+	const struct v4l2_ctrl_vp9_frame *decode_params;
+};
+
+struct rkvdec_vp9_frame_info {
+	u32 valid : 1;
+	u32 segmapid : 1;
+	u32 frame_context_idx : 2;
+	u32 reference_mode : 2;
+	u32 tx_mode : 3;
+	u32 interpolation_filter : 3;
+	u32 flags;
+	u64 timestamp;
+	struct v4l2_vp9_segmentation seg;
+	struct v4l2_vp9_loop_filter lf;
+};
+
+struct rkvdec_vp9_ctx {
+	struct rkvdec_aux_buf priv_tbl;
+	struct rkvdec_aux_buf count_tbl;
+	struct v4l2_vp9_frame_symbol_counts inter_cnts;
+	struct v4l2_vp9_frame_symbol_counts intra_cnts;
+	struct v4l2_vp9_frame_context probability_tables;
+	struct v4l2_vp9_frame_context frame_context[4];
+	struct rkvdec_vp9_frame_info cur;
+	struct rkvdec_vp9_frame_info last;
+};
+
+static void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane)
+{
+	unsigned int idx = 0, byte_count = 0;
+	int k, m, n;
+	u8 p;
+
+	for (k = 0; k < 6; k++) {
+		for (m = 0; m < 6; m++) {
+			for (n = 0; n < 3; n++) {
+				p = coef[k][m][n];
+				coeff_plane[idx++] = p;
+				byte_count++;
+				if (byte_count == 27) {
+					idx += 5;
+					byte_count = 0;
+				}
+			}
+		}
+	}
+}
+
+static void init_intra_only_probs(struct rkvdec_ctx *ctx,
+				  const struct rkvdec_vp9_run *run)
+{
+	const struct v4l2_ctrl_vp9_frame *dec_params;
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
+	struct rkvdec_vp9_intra_only_frame_probs *rkprobs;
+	const struct v4l2_vp9_frame_context *probs;
+	unsigned int i, j, k, m;
+
+	rkprobs = &tbl->probs.intra_only;
+	dec_params = run->decode_params;
+	probs = &vp9_ctx->probability_tables;
+
+	/*
+	 * Intra-only probs: 149 x 128 bits, aligned to 152 x 128 bits.
+	 * Coefficient-related probs: 64 x 128 bits.
+	 */
+	for (i = 0; i < ARRAY_SIZE(probs->coef); i++) {
+		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++)
+			write_coeff_plane(probs->coef[i][j][0],
+					  rkprobs->coef_intra[i][j]);
+	}
+
+	/* intra mode prob  80 x 128 bits */
+	for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob); i++) {
+		unsigned int byte_count = 0;
+		int idx = 0;
+
+		/* vp9_kf_y_mode_prob */
+		for (j = 0; j < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0]); j++) {
+			for (k = 0; k < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0][0]);
+			     k++) {
+				u8 val = v4l2_vp9_kf_y_mode_prob[i][j][k];
+
+				rkprobs->intra_mode[i].y_mode[idx++] = val;
+				byte_count++;
+				if (byte_count == 27) {
+					byte_count = 0;
+					idx += 5;
+				}
+			}
+		}
+
+		idx = 0;
+		if (i < 4) {
+			for (m = 0; m < (i < 3 ? 23 : 21); m++) {
+				const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;
+
+				rkprobs->intra_mode[i].uv_mode[idx++] = ptr[i * 23 + m];
+			}
+		}
+	}
+}
+
+static void init_inter_probs(struct rkvdec_ctx *ctx,
+			     const struct rkvdec_vp9_run *run)
+{
+	const struct v4l2_ctrl_vp9_frame *dec_params;
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
+	struct rkvdec_vp9_inter_frame_probs *rkprobs;
+	const struct v4l2_vp9_frame_context *probs;
+	unsigned int i, j, k;
+
+	rkprobs = &tbl->probs.inter;
+	dec_params = run->decode_params;
+	probs = &vp9_ctx->probability_tables;
+
+	/*
+	 * inter probs
+	 * 151 x 128 bits, aligned to 152 x 128 bits
+	 * inter only
+	 * intra_y_mode & inter_block info 6 x 128 bits
+	 */
+
+	memcpy(rkprobs->y_mode, probs->y_mode, sizeof(rkprobs->y_mode));
+	memcpy(rkprobs->comp_mode, probs->comp_mode,
+	       sizeof(rkprobs->comp_mode));
+	memcpy(rkprobs->comp_ref, probs->comp_ref,
+	       sizeof(rkprobs->comp_ref));
+	memcpy(rkprobs->single_ref, probs->single_ref,
+	       sizeof(rkprobs->single_ref));
+	memcpy(rkprobs->inter_mode, probs->inter_mode,
+	       sizeof(rkprobs->inter_mode));
+	memcpy(rkprobs->interp_filter, probs->interp_filter,
+	       sizeof(rkprobs->interp_filter));
+
+	/* 128 x 128 bits coeff related */
+	for (i = 0; i < ARRAY_SIZE(probs->coef); i++) {
+		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) {
+			for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++)
+				write_coeff_plane(probs->coef[i][j][k],
+						  rkprobs->coef[k][i][j]);
+		}
+	}
+
+	/* intra uv mode 6 x 128 */
+	memcpy(rkprobs->uv_mode_0_2, &probs->uv_mode[0],
+	       sizeof(rkprobs->uv_mode_0_2));
+	memcpy(rkprobs->uv_mode_3_5, &probs->uv_mode[3],
+	       sizeof(rkprobs->uv_mode_3_5));
+	memcpy(rkprobs->uv_mode_6_8, &probs->uv_mode[6],
+	       sizeof(rkprobs->uv_mode_6_8));
+	memcpy(rkprobs->uv_mode_9, &probs->uv_mode[9],
+	       sizeof(rkprobs->uv_mode_9));
+
+	/* mv related 6 x 128 */
+	memcpy(rkprobs->mv.joint, probs->mv.joint,
+	       sizeof(rkprobs->mv.joint));
+	memcpy(rkprobs->mv.sign, probs->mv.sign,
+	       sizeof(rkprobs->mv.sign));
+	memcpy(rkprobs->mv.classes, probs->mv.classes,
+	       sizeof(rkprobs->mv.classes));
+	memcpy(rkprobs->mv.class0_bit, probs->mv.class0_bit,
+	       sizeof(rkprobs->mv.class0_bit));
+	memcpy(rkprobs->mv.bits, probs->mv.bits,
+	       sizeof(rkprobs->mv.bits));
+	memcpy(rkprobs->mv.class0_fr, probs->mv.class0_fr,
+	       sizeof(rkprobs->mv.class0_fr));
+	memcpy(rkprobs->mv.fr, probs->mv.fr,
+	       sizeof(rkprobs->mv.fr));
+	memcpy(rkprobs->mv.class0_hp, probs->mv.class0_hp,
+	       sizeof(rkprobs->mv.class0_hp));
+	memcpy(rkprobs->mv.hp, probs->mv.hp,
+	       sizeof(rkprobs->mv.hp));
+}
+
+static void init_probs(struct rkvdec_ctx *ctx,
+		       const struct rkvdec_vp9_run *run)
+{
+	const struct v4l2_ctrl_vp9_frame *dec_params;
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
+	struct rkvdec_vp9_probs *rkprobs = &tbl->probs;
+	const struct v4l2_vp9_segmentation *seg;
+	const struct v4l2_vp9_frame_context *probs;
+	bool intra_only;
+
+	dec_params = run->decode_params;
+	probs = &vp9_ctx->probability_tables;
+	seg = &dec_params->seg;
+
+	memset(rkprobs, 0, sizeof(*rkprobs));
+
+	intra_only = !!(dec_params->flags &
+			(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
+			 V4L2_VP9_FRAME_FLAG_INTRA_ONLY));
+
+	/* sb info  5 x 128 bit */
+	memcpy(rkprobs->partition,
+	       intra_only ? v4l2_vp9_kf_partition_probs : probs->partition,
+	       sizeof(rkprobs->partition));
+
+	memcpy(rkprobs->pred, seg->pred_probs, sizeof(rkprobs->pred));
+	memcpy(rkprobs->tree, seg->tree_probs, sizeof(rkprobs->tree));
+	memcpy(rkprobs->skip, probs->skip, sizeof(rkprobs->skip));
+	memcpy(rkprobs->tx32, probs->tx32, sizeof(rkprobs->tx32));
+	memcpy(rkprobs->tx16, probs->tx16, sizeof(rkprobs->tx16));
+	memcpy(rkprobs->tx8, probs->tx8, sizeof(rkprobs->tx8));
+	memcpy(rkprobs->is_inter, probs->is_inter, sizeof(rkprobs->is_inter));
+
+	if (intra_only)
+		init_intra_only_probs(ctx, run);
+	else
+		init_inter_probs(ctx, run);
+}
+
+struct rkvdec_vp9_ref_reg {
+	u32 reg_frm_size;
+	u32 reg_hor_stride;
+	u32 reg_y_stride;
+	u32 reg_yuv_stride;
+	u32 reg_ref_base;
+};
+
+static struct rkvdec_vp9_ref_reg ref_regs[] = {
+	{
+		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(0),
+		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(0),
+		.reg_y_stride = RKVDEC_VP9_LAST_FRAME_YSTRIDE,
+		.reg_yuv_stride = RKVDEC_VP9_LAST_FRAME_YUVSTRIDE,
+		.reg_ref_base = RKVDEC_REG_VP9_LAST_FRAME_BASE,
+	},
+	{
+		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(1),
+		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(1),
+		.reg_y_stride = RKVDEC_VP9_GOLDEN_FRAME_YSTRIDE,
+		.reg_yuv_stride = 0,
+		.reg_ref_base = RKVDEC_REG_VP9_GOLDEN_FRAME_BASE,
+	},
+	{
+		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(2),
+		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(2),
+		.reg_y_stride = RKVDEC_VP9_ALTREF_FRAME_YSTRIDE,
+		.reg_yuv_stride = 0,
+		.reg_ref_base = RKVDEC_REG_VP9_ALTREF_FRAME_BASE,
+	}
+};
+
+static struct rkvdec_decoded_buffer *
+get_ref_buf(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp)
+{
+	struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
+	struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
+	int buf_idx;
+
+	/*
+	 * If a ref is unused or invalid, the current destination buffer is
+	 * returned.
+	 */
+	buf_idx = vb2_find_timestamp(cap_q, timestamp, 0);
+	if (buf_idx < 0)
+		return vb2_to_rkvdec_decoded_buf(&dst->vb2_buf);
+
+	return vb2_to_rkvdec_decoded_buf(vb2_get_buffer(cap_q, buf_idx));
+}
+
+static dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf)
+{
+	unsigned int aligned_pitch, aligned_height, yuv_len;
+
+	aligned_height = round_up(buf->vp9.height, 64);
+	aligned_pitch = round_up(buf->vp9.width * buf->vp9.bit_depth, 512) / 8;
+	yuv_len = (aligned_height * aligned_pitch * 3) / 2;
+
+	return vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0) +
+	       yuv_len;
+}
+
+static void config_ref_registers(struct rkvdec_ctx *ctx,
+				 const struct rkvdec_vp9_run *run,
+				 struct rkvdec_decoded_buffer *ref_buf,
+				 struct rkvdec_vp9_ref_reg *ref_reg)
+{
+	unsigned int aligned_pitch, aligned_height, y_len, yuv_len;
+	struct rkvdec_dev *rkvdec = ctx->dev;
+
+	aligned_height = round_up(ref_buf->vp9.height, 64);
+	writel_relaxed(RKVDEC_VP9_FRAMEWIDTH(ref_buf->vp9.width) |
+		       RKVDEC_VP9_FRAMEHEIGHT(ref_buf->vp9.height),
+		       rkvdec->regs + ref_reg->reg_frm_size);
+
+	writel_relaxed(vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0),
+		       rkvdec->regs + ref_reg->reg_ref_base);
+
+	if (&ref_buf->base.vb == run->base.bufs.dst)
+		return;
+
+	aligned_pitch = round_up(ref_buf->vp9.width * ref_buf->vp9.bit_depth, 512) / 8;
+	y_len = aligned_height * aligned_pitch;
+	yuv_len = (y_len * 3) / 2;
+
+	writel_relaxed(RKVDEC_HOR_Y_VIRSTRIDE(aligned_pitch / 16) |
+		       RKVDEC_HOR_UV_VIRSTRIDE(aligned_pitch / 16),
+		       rkvdec->regs + ref_reg->reg_hor_stride);
+	writel_relaxed(RKVDEC_VP9_REF_YSTRIDE(y_len / 16),
+		       rkvdec->regs + ref_reg->reg_y_stride);
+
+	if (!ref_reg->reg_yuv_stride)
+		return;
+
+	writel_relaxed(RKVDEC_VP9_REF_YUVSTRIDE(yuv_len / 16),
+		       rkvdec->regs + ref_reg->reg_yuv_stride);
+}
+
+static void config_seg_registers(struct rkvdec_ctx *ctx, unsigned int segid)
+{
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	const struct v4l2_vp9_segmentation *seg;
+	struct rkvdec_dev *rkvdec = ctx->dev;
+	s16 feature_val;
+	int feature_id;
+	u32 val = 0;
+
+	seg = vp9_ctx->last.valid ? &vp9_ctx->last.seg : &vp9_ctx->cur.seg;
+	feature_id = V4L2_VP9_SEG_LVL_ALT_Q;
+	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
+		feature_val = seg->feature_data[segid][feature_id];
+		val |= RKVDEC_SEGID_FRAME_QP_DELTA_EN(1) |
+		       RKVDEC_SEGID_FRAME_QP_DELTA(feature_val);
+	}
+
+	feature_id = V4L2_VP9_SEG_LVL_ALT_L;
+	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
+		feature_val = seg->feature_data[segid][feature_id];
+		val |= RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE_EN(1) |
+		       RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE(feature_val);
+	}
+
+	feature_id = V4L2_VP9_SEG_LVL_REF_FRAME;
+	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
+		feature_val = seg->feature_data[segid][feature_id];
+		val |= RKVDEC_SEGID_REFERINFO_EN(1) |
+		       RKVDEC_SEGID_REFERINFO(feature_val);
+	}
+
+	feature_id = V4L2_VP9_SEG_LVL_SKIP;
+	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid))
+		val |= RKVDEC_SEGID_FRAME_SKIP_EN(1);
+
+	if (!segid &&
+	    (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE))
+		val |= RKVDEC_SEGID_ABS_DELTA(1);
+
+	writel_relaxed(val, rkvdec->regs + RKVDEC_VP9_SEGID_GRP(segid));
+}
+
+static void update_dec_buf_info(struct rkvdec_decoded_buffer *buf,
+				const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	buf->vp9.width = dec_params->frame_width_minus_1 + 1;
+	buf->vp9.height = dec_params->frame_height_minus_1 + 1;
+	buf->vp9.bit_depth = dec_params->bit_depth;
+}
+
+static void update_ctx_cur_info(struct rkvdec_vp9_ctx *vp9_ctx,
+				struct rkvdec_decoded_buffer *buf,
+				const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	vp9_ctx->cur.valid = true;
+	vp9_ctx->cur.reference_mode = dec_params->reference_mode;
+	vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter;
+	vp9_ctx->cur.flags = dec_params->flags;
+	vp9_ctx->cur.timestamp = buf->base.vb.vb2_buf.timestamp;
+	vp9_ctx->cur.seg = dec_params->seg;
+	vp9_ctx->cur.lf = dec_params->lf;
+}
+
+static void update_ctx_last_info(struct rkvdec_vp9_ctx *vp9_ctx)
+{
+	vp9_ctx->last = vp9_ctx->cur;
+}
+
+static void config_registers(struct rkvdec_ctx *ctx,
+			     const struct rkvdec_vp9_run *run)
+{
+	unsigned int y_len, uv_len, yuv_len, bit_depth, aligned_height, aligned_pitch, stream_len;
+	const struct v4l2_ctrl_vp9_frame *dec_params;
+	struct rkvdec_decoded_buffer *ref_bufs[3];
+	struct rkvdec_decoded_buffer *dst, *last, *mv_ref;
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	u32 val, last_frame_info = 0;
+	const struct v4l2_vp9_segmentation *seg;
+	struct rkvdec_dev *rkvdec = ctx->dev;
+	dma_addr_t addr;
+	bool intra_only;
+	unsigned int i;
+
+	dec_params = run->decode_params;
+	dst = vb2_to_rkvdec_decoded_buf(&run->base.bufs.dst->vb2_buf);
+	ref_bufs[0] = get_ref_buf(ctx, &dst->base.vb, dec_params->last_frame_ts);
+	ref_bufs[1] = get_ref_buf(ctx, &dst->base.vb, dec_params->golden_frame_ts);
+	ref_bufs[2] = get_ref_buf(ctx, &dst->base.vb, dec_params->alt_frame_ts);
+
+	if (vp9_ctx->last.valid)
+		last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp);
+	else
+		last = dst;
+
+	update_dec_buf_info(dst, dec_params);
+	update_ctx_cur_info(vp9_ctx, dst, dec_params);
+	seg = &dec_params->seg;
+
+	intra_only = !!(dec_params->flags &
+			(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
+			 V4L2_VP9_FRAME_FLAG_INTRA_ONLY));
+
+	writel_relaxed(RKVDEC_MODE(RKVDEC_MODE_VP9),
+		       rkvdec->regs + RKVDEC_REG_SYSCTRL);
+
+	bit_depth = dec_params->bit_depth;
+	aligned_height = round_up(ctx->decoded_fmt.fmt.pix_mp.height, 64);
+
+	aligned_pitch = round_up(ctx->decoded_fmt.fmt.pix_mp.width *
+				 bit_depth,
+				 512) / 8;
+	y_len = aligned_height * aligned_pitch;
+	uv_len = y_len / 2;
+	yuv_len = y_len + uv_len;
+
+	writel_relaxed(RKVDEC_Y_HOR_VIRSTRIDE(aligned_pitch / 16) |
+		       RKVDEC_UV_HOR_VIRSTRIDE(aligned_pitch / 16),
+		       rkvdec->regs + RKVDEC_REG_PICPAR);
+	writel_relaxed(RKVDEC_Y_VIRSTRIDE(y_len / 16),
+		       rkvdec->regs + RKVDEC_REG_Y_VIRSTRIDE);
+	writel_relaxed(RKVDEC_YUV_VIRSTRIDE(yuv_len / 16),
+		       rkvdec->regs + RKVDEC_REG_YUV_VIRSTRIDE);
+
+	stream_len = vb2_get_plane_payload(&run->base.bufs.src->vb2_buf, 0);
+	writel_relaxed(RKVDEC_STRM_LEN(stream_len),
+		       rkvdec->regs + RKVDEC_REG_STRM_LEN);
+
+	/*
+	 * Reset the count buffer: the decoder only outputs intra-related
+	 * syntax counts when decoding an intra frame, but the entropy update
+	 * needs to update all the probabilities.
+	 */
+	if (intra_only)
+		memset(vp9_ctx->count_tbl.cpu, 0, vp9_ctx->count_tbl.size);
+
+	vp9_ctx->cur.segmapid = vp9_ctx->last.segmapid;
+	if (!intra_only &&
+	    !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
+	    (!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED) ||
+	     (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP)))
+		vp9_ctx->cur.segmapid++;
+
+	for (i = 0; i < ARRAY_SIZE(ref_bufs); i++)
+		config_ref_registers(ctx, run, ref_bufs[i], &ref_regs[i]);
+
+	for (i = 0; i < 8; i++)
+		config_seg_registers(ctx, i);
+
+	writel_relaxed(RKVDEC_VP9_TX_MODE(vp9_ctx->cur.tx_mode) |
+		       RKVDEC_VP9_FRAME_REF_MODE(dec_params->reference_mode),
+		       rkvdec->regs + RKVDEC_VP9_CPRHEADER_CONFIG);
+
+	if (!intra_only) {
+		const struct v4l2_vp9_loop_filter *lf;
+		s8 delta;
+
+		if (vp9_ctx->last.valid)
+			lf = &vp9_ctx->last.lf;
+		else
+			lf = &vp9_ctx->cur.lf;
+
+		val = 0;
+		for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++) {
+			delta = lf->ref_deltas[i];
+			val |= RKVDEC_REF_DELTAS_LASTFRAME(i, delta);
+		}
+
+		writel_relaxed(val,
+			       rkvdec->regs + RKVDEC_VP9_REF_DELTAS_LASTFRAME);
+
+		for (i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++) {
+			delta = lf->mode_deltas[i];
+			last_frame_info |= RKVDEC_MODE_DELTAS_LASTFRAME(i,
+									delta);
+		}
+	}
+
+	if (vp9_ctx->last.valid && !intra_only &&
+	    vp9_ctx->last.seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED)
+		last_frame_info |= RKVDEC_SEG_EN_LASTFRAME;
+
+	if (vp9_ctx->last.valid &&
+	    vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_SHOW_FRAME)
+		last_frame_info |= RKVDEC_LAST_SHOW_FRAME;
+
+	if (vp9_ctx->last.valid &&
+	    vp9_ctx->last.flags &
+	    (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY))
+		last_frame_info |= RKVDEC_LAST_INTRA_ONLY;
+
+	if (vp9_ctx->last.valid &&
+	    last->vp9.width == dst->vp9.width &&
+	    last->vp9.height == dst->vp9.height)
+		last_frame_info |= RKVDEC_LAST_WIDHHEIGHT_EQCUR;
+
+	writel_relaxed(last_frame_info,
+		       rkvdec->regs + RKVDEC_VP9_INFO_LASTFRAME);
+
+	writel_relaxed(stream_len - dec_params->compressed_header_size -
+		       dec_params->uncompressed_header_size,
+		       rkvdec->regs + RKVDEC_VP9_LASTTILE_SIZE);
+
+	for (i = 0; !intra_only && i < ARRAY_SIZE(ref_bufs); i++) {
+		unsigned int refw = ref_bufs[i]->vp9.width;
+		unsigned int refh = ref_bufs[i]->vp9.height;
+		u32 hscale, vscale;
+
+		hscale = (refw << 14) / dst->vp9.width;
+		vscale = (refh << 14) / dst->vp9.height;
+		writel_relaxed(RKVDEC_VP9_REF_HOR_SCALE(hscale) |
+			       RKVDEC_VP9_REF_VER_SCALE(vscale),
+			       rkvdec->regs + RKVDEC_VP9_REF_SCALE(i));
+	}
+
+	addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
+	writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_DECOUT_BASE);
+	addr = vb2_dma_contig_plane_dma_addr(&run->base.bufs.src->vb2_buf, 0);
+	writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_STRM_RLC_BASE);
+	writel_relaxed(vp9_ctx->priv_tbl.dma +
+		       offsetof(struct rkvdec_vp9_priv_tbl, probs),
+		       rkvdec->regs + RKVDEC_REG_CABACTBL_PROB_BASE);
+	writel_relaxed(vp9_ctx->count_tbl.dma,
+		       rkvdec->regs + RKVDEC_REG_VP9COUNT_BASE);
+
+	writel_relaxed(vp9_ctx->priv_tbl.dma +
+		       offsetof(struct rkvdec_vp9_priv_tbl, segmap) +
+		       (RKVDEC_VP9_MAX_SEGMAP_SIZE * vp9_ctx->cur.segmapid),
+		       rkvdec->regs + RKVDEC_REG_VP9_SEGIDCUR_BASE);
+	writel_relaxed(vp9_ctx->priv_tbl.dma +
+		       offsetof(struct rkvdec_vp9_priv_tbl, segmap) +
+		       (RKVDEC_VP9_MAX_SEGMAP_SIZE * (!vp9_ctx->cur.segmapid)),
+		       rkvdec->regs + RKVDEC_REG_VP9_SEGIDLAST_BASE);
+
+	if (!intra_only &&
+	    !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
+	    vp9_ctx->last.valid)
+		mv_ref = last;
+	else
+		mv_ref = dst;
+
+	writel_relaxed(get_mv_base_addr(mv_ref),
+		       rkvdec->regs + RKVDEC_VP9_REF_COLMV_BASE);
+
+	writel_relaxed(ctx->decoded_fmt.fmt.pix_mp.width |
+		       (ctx->decoded_fmt.fmt.pix_mp.height << 16),
+		       rkvdec->regs + RKVDEC_REG_PERFORMANCE_CYCLE);
+}
+
+static int validate_dec_params(struct rkvdec_ctx *ctx,
+			       const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	unsigned int aligned_width, aligned_height;
+
+	/* We only support profile 0. */
+	if (dec_params->profile != 0) {
+		dev_err(ctx->dev->dev, "unsupported profile %d\n",
+			dec_params->profile);
+		return -EINVAL;
+	}
+
+	aligned_width = round_up(dec_params->frame_width_minus_1 + 1, 64);
+	aligned_height = round_up(dec_params->frame_height_minus_1 + 1, 64);
+
+	/*
+	 * Userspace should update the capture/decoded format when the
+	 * resolution changes.
+	 */
+	if (aligned_width != ctx->decoded_fmt.fmt.pix_mp.width ||
+	    aligned_height != ctx->decoded_fmt.fmt.pix_mp.height) {
+		dev_err(ctx->dev->dev,
+			"unexpected bitstream resolution %dx%d\n",
+			dec_params->frame_width_minus_1 + 1,
+			dec_params->frame_height_minus_1 + 1);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rkvdec_vp9_run_preamble(struct rkvdec_ctx *ctx,
+				   struct rkvdec_vp9_run *run)
+{
+	const struct v4l2_ctrl_vp9_frame *dec_params;
+	const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates;
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct v4l2_ctrl *ctrl;
+	unsigned int fctx_idx;
+	int ret;
+
+	/* v4l2-specific stuff */
+	rkvdec_run_preamble(ctx, &run->base);
+
+	ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl,
+			      V4L2_CID_STATELESS_VP9_FRAME);
+	if (WARN_ON(!ctrl))
+		return -EINVAL;
+	dec_params = ctrl->p_cur.p;
+
+	ret = validate_dec_params(ctx, dec_params);
+	if (ret)
+		return ret;
+
+	run->decode_params = dec_params;
+
+	ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
+	if (WARN_ON(!ctrl))
+		return -EINVAL;
+	prob_updates = ctrl->p_cur.p;
+	vp9_ctx->cur.tx_mode = prob_updates->tx_mode;
+
+	/*
+	 * vp9 stuff
+	 *
+	 * by this point the userspace has done all parts of 6.2 uncompressed_header()
+	 * except this fragment:
+	 * if ( FrameIsIntra || error_resilient_mode ) {
+	 *	setup_past_independence ( )
+	 *	if ( frame_type == KEY_FRAME || error_resilient_mode == 1 ||
+	 *	     reset_frame_context == 3 ) {
+	 *		for ( i = 0; i < 4; i ++ ) {
+	 *			save_probs( i )
+	 *		}
+	 *	} else if ( reset_frame_context == 2 ) {
+	 *		save_probs( frame_context_idx )
+	 *	}
+	 *	frame_context_idx = 0
+	 * }
+	 */
+	fctx_idx = v4l2_vp9_reset_frame_ctx(dec_params, vp9_ctx->frame_context);
+	vp9_ctx->cur.frame_context_idx = fctx_idx;
+
+	/* 6.1 frame(sz): load_probs() and load_probs2() */
+	vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx];
+
+	/*
+	 * The userspace has also performed 6.3 compressed_header(), but handling the
+	 * probs in a special way. All probs which need updating, except MV-related,
+	 * have been read from the bitstream and translated through inv_map_table[],
+	 * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed
+	 * by userspace are either translated values (there are no 0 values in
+	 * inv_map_table[]), or zero to indicate no update. All MV-related probs which need
+	 * updating have been read from the bitstream and (mv_prob << 1) | 1 has been
+	 * performed. The values passed by userspace are either new values
+	 * to replace old ones (the above mentioned shift and bitwise or never result in
+	 * a zero) or zero to indicate no update.
+	 * fw_update_probs() performs actual probs updates or leaves probs as-is
+	 * for values for which a zero was passed from userspace.
+	 */
+	v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params);
+
+	return 0;
+}
+
+static int rkvdec_vp9_run(struct rkvdec_ctx *ctx)
+{
+	struct rkvdec_dev *rkvdec = ctx->dev;
+	struct rkvdec_vp9_run run = { };
+	int ret;
+
+	ret = rkvdec_vp9_run_preamble(ctx, &run);
+	if (ret) {
+		rkvdec_run_postamble(ctx, &run.base);
+		return ret;
+	}
+
+	/* Prepare probs. */
+	init_probs(ctx, &run);
+
+	/* Configure hardware registers. */
+	config_registers(ctx, &run);
+
+	rkvdec_run_postamble(ctx, &run.base);
+
+	schedule_delayed_work(&rkvdec->watchdog_work, msecs_to_jiffies(2000));
+
+	writel(1, rkvdec->regs + RKVDEC_REG_PREF_LUMA_CACHE_COMMAND);
+	writel(1, rkvdec->regs + RKVDEC_REG_PREF_CHR_CACHE_COMMAND);
+
+	writel(0xe, rkvdec->regs + RKVDEC_REG_STRMD_ERR_EN);
+	/* Start decoding! */
+	writel(RKVDEC_INTERRUPT_DEC_E | RKVDEC_CONFIG_DEC_CLK_GATE_E |
+	       RKVDEC_TIMEOUT_E | RKVDEC_BUF_EMPTY_E,
+	       rkvdec->regs + RKVDEC_REG_INTERRUPT);
+
+	return 0;
+}
+
+#define copy_tx_and_skip(p1, p2)				\
+do {								\
+	memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8));	\
+	memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16));	\
+	memcpy((p1)->tx32, (p2)->tx32, sizeof((p1)->tx32));	\
+	memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip));	\
+} while (0)
+
+static void rkvdec_vp9_done(struct rkvdec_ctx *ctx,
+			    struct vb2_v4l2_buffer *src_buf,
+			    struct vb2_v4l2_buffer *dst_buf,
+			    enum vb2_buffer_state result)
+{
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	unsigned int fctx_idx;
+
+	/* v4l2-specific stuff */
+	if (result == VB2_BUF_STATE_ERROR)
+		goto out_update_last;
+
+	/*
+	 * vp9 stuff
+	 *
+	 * 6.1.2 refresh_probs()
+	 *
+	 * In the spec a complementary condition goes last in 6.1.2 refresh_probs(),
+	 * but it makes no sense to perform all the activities from the first "if"
+	 * there if we actually are not refreshing the frame context. On top of that,
+	 * because of 6.2 uncompressed_header() whenever error_resilient_mode == 1,
+	 * refresh_frame_context == 0. Consequently, if we don't jump to out_update_last
+	 * it means error_resilient_mode must be 0.
+	 */
+	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX))
+		goto out_update_last;
+
+	fctx_idx = vp9_ctx->cur.frame_context_idx;
+
+	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) {
+		/* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */
+		struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables;
+		bool frame_is_intra = vp9_ctx->cur.flags &
+		    (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY);
+		struct tx_and_skip {
+			u8 tx8[2][1];
+			u8 tx16[2][2];
+			u8 tx32[2][3];
+			u8 skip[3];
+		} _tx_skip, *tx_skip = &_tx_skip;
+		struct v4l2_vp9_frame_symbol_counts *counts;
+
+		/* buffer the forward-updated TX and skip probs */
+		if (frame_is_intra)
+			copy_tx_and_skip(tx_skip, probs);
+
+		/* 6.1.2 refresh_probs(): load_probs() and load_probs2() */
+		*probs = vp9_ctx->frame_context[fctx_idx];
+
+		/* if FrameIsIntra then undo the effect of load_probs2() */
+		if (frame_is_intra)
+			copy_tx_and_skip(probs, tx_skip);
+
+		counts = frame_is_intra ? &vp9_ctx->intra_cnts : &vp9_ctx->inter_cnts;
+		v4l2_vp9_adapt_coef_probs(probs, counts,
+					  !vp9_ctx->last.valid ||
+					  vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME,
+					  frame_is_intra);
+		if (!frame_is_intra) {
+			const struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts;
+			u32 classes[2][11];
+			int i;
+
+			inter_cnts = vp9_ctx->count_tbl.cpu;
+			for (i = 0; i < ARRAY_SIZE(classes); ++i)
+				memcpy(classes[i], inter_cnts->classes[i], sizeof(classes[0]));
+			counts->classes = &classes;
+
+			/* load_probs2() already done */
+			v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts,
+						     vp9_ctx->cur.reference_mode,
+						     vp9_ctx->cur.interpolation_filter,
+						     vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags);
+		}
+	}
+
+	/* 6.1.2 refresh_probs(): save_probs(fctx_idx) */
+	vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables;
+
+out_update_last:
+	update_ctx_last_info(vp9_ctx);
+}
+
+static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx)
+{
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu;
+	struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu;
+	int i, j, k, l, m;
+
+	vp9_ctx->inter_cnts.partition = &inter_cnts->partition;
+	vp9_ctx->inter_cnts.skip = &inter_cnts->skip;
+	vp9_ctx->inter_cnts.intra_inter = &inter_cnts->inter;
+	vp9_ctx->inter_cnts.tx32p = &inter_cnts->tx32p;
+	vp9_ctx->inter_cnts.tx16p = &inter_cnts->tx16p;
+	vp9_ctx->inter_cnts.tx8p = &inter_cnts->tx8p;
+
+	vp9_ctx->intra_cnts.partition = (u32 (*)[16][4])(&intra_cnts->partition);
+	vp9_ctx->intra_cnts.skip = &intra_cnts->skip;
+	vp9_ctx->intra_cnts.intra_inter = &intra_cnts->intra;
+	vp9_ctx->intra_cnts.tx32p = &intra_cnts->tx32p;
+	vp9_ctx->intra_cnts.tx16p = &intra_cnts->tx16p;
+	vp9_ctx->intra_cnts.tx8p = &intra_cnts->tx8p;
+
+	vp9_ctx->inter_cnts.y_mode = &inter_cnts->y_mode;
+	vp9_ctx->inter_cnts.uv_mode = &inter_cnts->uv_mode;
+	vp9_ctx->inter_cnts.comp = &inter_cnts->comp;
+	vp9_ctx->inter_cnts.comp_ref = &inter_cnts->comp_ref;
+	vp9_ctx->inter_cnts.single_ref = &inter_cnts->single_ref;
+	vp9_ctx->inter_cnts.mv_mode = &inter_cnts->mv_mode;
+	vp9_ctx->inter_cnts.filter = &inter_cnts->filter;
+	vp9_ctx->inter_cnts.mv_joint = &inter_cnts->mv_joint;
+	vp9_ctx->inter_cnts.sign = &inter_cnts->sign;
+	/*
+	 * rk hardware actually uses "u32 classes[2][11 + 1];"
+	 * instead of "u32 classes[2][11];", so this must be explicitly
+	 * copied into vp9_ctx->classes when passing the data to the
+	 * vp9 library function
+	 */
+	vp9_ctx->inter_cnts.class0 = &inter_cnts->class0;
+	vp9_ctx->inter_cnts.bits = &inter_cnts->bits;
+	vp9_ctx->inter_cnts.class0_fp = &inter_cnts->class0_fp;
+	vp9_ctx->inter_cnts.fp = &inter_cnts->fp;
+	vp9_ctx->inter_cnts.class0_hp = &inter_cnts->class0_hp;
+	vp9_ctx->inter_cnts.hp = &inter_cnts->hp;
+
+#define INNERMOST_LOOP \
+	do {										\
+		for (m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {\
+			vp9_ctx->inter_cnts.coeff[i][j][k][l][m] =			\
+				&inter_cnts->ref_cnt[k][i][j][l][m].coeff;		\
+			vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] =			\
+				&inter_cnts->ref_cnt[k][i][j][l][m].eob[0];		\
+			vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] =			\
+				&inter_cnts->ref_cnt[k][i][j][l][m].eob[1];		\
+											\
+			vp9_ctx->intra_cnts.coeff[i][j][k][l][m] =			\
+				&intra_cnts->ref_cnt[k][i][j][l][m].coeff;		\
+			vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] =			\
+				&intra_cnts->ref_cnt[k][i][j][l][m].eob[0];		\
+			vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] =			\
+				&intra_cnts->ref_cnt[k][i][j][l][m].eob[1];		\
+		}									\
+	} while (0)
+
+	for (i = 0; i < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff); ++i)
+		for (j = 0; j < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0]); ++j)
+			for (k = 0; k < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0]); ++k)
+				for (l = 0; l < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0]); ++l)
+					INNERMOST_LOOP;
+#undef INNERMOST_LOOP
+}
+
+static int rkvdec_vp9_start(struct rkvdec_ctx *ctx)
+{
+	struct rkvdec_dev *rkvdec = ctx->dev;
+	struct rkvdec_vp9_priv_tbl *priv_tbl;
+	struct rkvdec_vp9_ctx *vp9_ctx;
+	unsigned char *count_tbl;
+	int ret;
+
+	vp9_ctx = kzalloc(sizeof(*vp9_ctx), GFP_KERNEL);
+	if (!vp9_ctx)
+		return -ENOMEM;
+
+	ctx->priv = vp9_ctx;
+
+	priv_tbl = dma_alloc_coherent(rkvdec->dev, sizeof(*priv_tbl),
+				      &vp9_ctx->priv_tbl.dma, GFP_KERNEL);
+	if (!priv_tbl) {
+		ret = -ENOMEM;
+		goto err_free_ctx;
+	}
+
+	vp9_ctx->priv_tbl.size = sizeof(*priv_tbl);
+	vp9_ctx->priv_tbl.cpu = priv_tbl;
+	memset(priv_tbl, 0, sizeof(*priv_tbl));
+
+	count_tbl = dma_alloc_coherent(rkvdec->dev, RKVDEC_VP9_COUNT_SIZE,
+				       &vp9_ctx->count_tbl.dma, GFP_KERNEL);
+	if (!count_tbl) {
+		ret = -ENOMEM;
+		goto err_free_priv_tbl;
+	}
+
+	vp9_ctx->count_tbl.size = RKVDEC_VP9_COUNT_SIZE;
+	vp9_ctx->count_tbl.cpu = count_tbl;
+	memset(count_tbl, 0, RKVDEC_VP9_COUNT_SIZE);
+	rkvdec_init_v4l2_vp9_count_tbl(ctx);
+
+	return 0;
+
+err_free_priv_tbl:
+	dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size,
+			  vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma);
+
+err_free_ctx:
+	kfree(vp9_ctx);
+	return ret;
+}
+
+static void rkvdec_vp9_stop(struct rkvdec_ctx *ctx)
+{
+	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
+	struct rkvdec_dev *rkvdec = ctx->dev;
+
+	dma_free_coherent(rkvdec->dev, vp9_ctx->count_tbl.size,
+			  vp9_ctx->count_tbl.cpu, vp9_ctx->count_tbl.dma);
+	dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size,
+			  vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma);
+	kfree(vp9_ctx);
+}
+
+static int rkvdec_vp9_adjust_fmt(struct rkvdec_ctx *ctx,
+				 struct v4l2_format *f)
+{
+	struct v4l2_pix_format_mplane *fmt = &f->fmt.pix_mp;
+
+	fmt->num_planes = 1;
+	if (!fmt->plane_fmt[0].sizeimage)
+		fmt->plane_fmt[0].sizeimage = fmt->width * fmt->height * 2;
+	return 0;
+}
+
+const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops = {
+	.adjust_fmt = rkvdec_vp9_adjust_fmt,
+	.start = rkvdec_vp9_start,
+	.stop = rkvdec_vp9_stop,
+	.run = rkvdec_vp9_run,
+	.done = rkvdec_vp9_done,
+};
diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
index 7131156c1f2c..6aa8aca66547 100644
--- a/drivers/staging/media/rkvdec/rkvdec.c
+++ b/drivers/staging/media/rkvdec/rkvdec.c
@@ -99,10 +99,30 @@ static const struct rkvdec_ctrls rkvdec_h264_ctrls = {
 	.num_ctrls = ARRAY_SIZE(rkvdec_h264_ctrl_descs),
 };
 
-static const u32 rkvdec_h264_decoded_fmts[] = {
+static const u32 rkvdec_h264_vp9_decoded_fmts[] = {
 	V4L2_PIX_FMT_NV12,
 };
 
+static const struct rkvdec_ctrl_desc rkvdec_vp9_ctrl_descs[] = {
+	{
+		.cfg.id = V4L2_CID_STATELESS_VP9_FRAME,
+	},
+	{
+		.cfg.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
+	},
+	{
+		.cfg.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
+		.cfg.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+		.cfg.max = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+		.cfg.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+	},
+};
+
+static const struct rkvdec_ctrls rkvdec_vp9_ctrls = {
+	.ctrls = rkvdec_vp9_ctrl_descs,
+	.num_ctrls = ARRAY_SIZE(rkvdec_vp9_ctrl_descs),
+};
+
 static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_H264_SLICE,
@@ -116,8 +136,23 @@ static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
 		},
 		.ctrls = &rkvdec_h264_ctrls,
 		.ops = &rkvdec_h264_fmt_ops,
-		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_decoded_fmts),
-		.decoded_fmts = rkvdec_h264_decoded_fmts,
+		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts),
+		.decoded_fmts = rkvdec_h264_vp9_decoded_fmts,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9_FRAME,
+		.frmsize = {
+			.min_width = 64,
+			.max_width = 4096,
+			.step_width = 64,
+			.min_height = 64,
+			.max_height = 2304,
+			.step_height = 64,
+		},
+		.ctrls = &rkvdec_vp9_ctrls,
+		.ops = &rkvdec_vp9_fmt_ops,
+		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts),
+		.decoded_fmts = rkvdec_h264_vp9_decoded_fmts,
 	}
 };
 
@@ -319,7 +354,7 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
 	struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
 	const struct rkvdec_coded_fmt_desc *desc;
 	struct v4l2_format *cap_fmt;
-	struct vb2_queue *peer_vq;
+	struct vb2_queue *peer_vq, *vq;
 	int ret;
 
 	/*
@@ -331,6 +366,15 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
 	if (vb2_is_busy(peer_vq))
 		return -EBUSY;
 
+	/*
+	 * Some codecs like VP9 can contain dynamic resolution changes which
+	 * are currently not supported by the V4L2 API or driver, so return
+	 * an error if userspace tries to reconfigure the output format.
+	 */
+	vq = v4l2_m2m_get_vq(m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+	if (vb2_is_busy(vq))
+		return -EINVAL;
+
 	ret = rkvdec_s_fmt(file, priv, f, rkvdec_try_output_fmt);
 	if (ret)
 		return ret;
diff --git a/drivers/staging/media/rkvdec/rkvdec.h b/drivers/staging/media/rkvdec/rkvdec.h
index 52ac3874c5e5..2f4ea1786b93 100644
--- a/drivers/staging/media/rkvdec/rkvdec.h
+++ b/drivers/staging/media/rkvdec/rkvdec.h
@@ -42,14 +42,18 @@ struct rkvdec_run {
 
 struct rkvdec_vp9_decoded_buffer_info {
 	/* Info needed when the decoded frame serves as a reference frame. */
-	u16 width;
-	u16 height;
-	u32 bit_depth : 4;
+	unsigned short width;
+	unsigned short height;
+	unsigned int bit_depth : 4;
 };
 
 struct rkvdec_decoded_buffer {
 	/* Must be the first field in this struct. */
 	struct v4l2_m2m_buffer base;
+
+	union {
+		struct rkvdec_vp9_decoded_buffer_info vp9;
+	};
 };
 
 static inline struct rkvdec_decoded_buffer *
@@ -116,4 +120,6 @@ void rkvdec_run_preamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run);
 void rkvdec_run_postamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run);
 
 extern const struct rkvdec_coded_fmt_ops rkvdec_h264_fmt_ops;
+extern const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops;
+
 #endif /* RKVDEC_H_ */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 37+ messages in thread
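
Reader's note on config_registers() in the patch above: the stride and
reference-scale arithmetic can be checked in isolation. What follows is a
minimal userspace sketch, not driver code. The helper names
(round_up_pow2, vp9_aligned_pitch, vp9_y_len, vp9_ref_scale) are
hypothetical and only mirror the expressions used in the patch: the pitch
is width * bit_depth bits padded to 512 bits, the height is padded to 64
rows, and the scale factor is a fixed-point ratio where 1 << 14 means 1:1.

```c
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static inline uint32_t round_up_pow2(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Line pitch in bytes: width * bit_depth bits, padded to 512 bits. */
static inline uint32_t vp9_aligned_pitch(uint32_t width, uint32_t bit_depth)
{
	return round_up_pow2(width * bit_depth, 512) / 8;
}

/* Luma plane size in bytes, with the height padded to 64 rows. */
static inline uint32_t vp9_y_len(uint32_t width, uint32_t height,
				 uint32_t bit_depth)
{
	return round_up_pow2(height, 64) * vp9_aligned_pitch(width, bit_depth);
}

/* Reference-to-current scale factor; 1 << 14 means "same size". */
static inline uint32_t vp9_ref_scale(uint32_t ref_dim, uint32_t cur_dim)
{
	return (ref_dim << 14) / cur_dim;
}
```

For a 1920x1080 8-bit frame this gives a pitch of 1920 bytes and a luma
plane of 1088 * 1920 bytes; the chroma length in the patch is half of
that. A reference of half the current width yields a scale of 8192.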

* [PATCH v7 08/11] media: hantro: Rename registers
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (6 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 07/11] media: rkvdec: Add the VP9 backend Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 09/11] media: hantro: Prepare for other G2 codecs Andrzej Pietrasiewicz
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel

Make register naming more consistent.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 38 +++++++++----------
 drivers/staging/media/hantro/hantro_g2_regs.h | 28 +++++++-------
 2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
index 340efb57fd18..97da719a9844 100644
--- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -448,9 +448,9 @@ static int set_ref(struct hantro_ctx *ctx)
 		if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
 			dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
 
-		hantro_write_addr(vpu, G2_REG_ADDR_REF(i), luma_addr);
-		hantro_write_addr(vpu, G2_REG_CHR_REF(i), chroma_addr);
-		hantro_write_addr(vpu, G2_REG_DMV_REF(i), mv_addr);
+		hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), luma_addr);
+		hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), chroma_addr);
+		hantro_write_addr(vpu, G2_REF_MV_ADDR(i), mv_addr);
 	}
 
 	luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
@@ -460,20 +460,20 @@ static int set_ref(struct hantro_ctx *ctx)
 	chroma_addr = luma_addr + cr_offset;
 	mv_addr = luma_addr + mv_offset;
 
-	hantro_write_addr(vpu, G2_REG_ADDR_REF(i), luma_addr);
-	hantro_write_addr(vpu, G2_REG_CHR_REF(i), chroma_addr);
-	hantro_write_addr(vpu, G2_REG_DMV_REF(i++), mv_addr);
+	hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), luma_addr);
+	hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), chroma_addr);
+	hantro_write_addr(vpu, G2_REF_MV_ADDR(i++), mv_addr);
 
-	hantro_write_addr(vpu, G2_ADDR_DST, luma_addr);
-	hantro_write_addr(vpu, G2_ADDR_DST_CHR, chroma_addr);
-	hantro_write_addr(vpu, G2_ADDR_DST_MV, mv_addr);
+	hantro_write_addr(vpu, G2_OUT_LUMA_ADDR, luma_addr);
+	hantro_write_addr(vpu, G2_OUT_CHROMA_ADDR, chroma_addr);
+	hantro_write_addr(vpu, G2_OUT_MV_ADDR, mv_addr);
 
 	hantro_hevc_ref_remove_unused(ctx);
 
 	for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
-		hantro_write_addr(vpu, G2_REG_ADDR_REF(i), 0);
-		hantro_write_addr(vpu, G2_REG_CHR_REF(i), 0);
-		hantro_write_addr(vpu, G2_REG_DMV_REF(i), 0);
+		hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), 0);
+		hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), 0);
+		hantro_write_addr(vpu, G2_REF_MV_ADDR(i), 0);
 	}
 
 	hantro_reg_write(vpu, &g2_refer_lterm_e, dpb_longterm_e);
@@ -499,7 +499,7 @@ static void set_buffers(struct hantro_ctx *ctx)
 	src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
 	src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
 
-	hantro_write_addr(vpu, G2_ADDR_STR, src_dma);
+	hantro_write_addr(vpu, G2_STREAM_ADDR, src_dma);
 	hantro_reg_write(vpu, &g2_stream_len, src_len);
 	hantro_reg_write(vpu, &g2_strm_buffer_len, src_buf_len);
 	hantro_reg_write(vpu, &g2_strm_start_offset, 0);
@@ -508,12 +508,12 @@ static void set_buffers(struct hantro_ctx *ctx)
 	/* Destination (decoded frame) buffer. */
 	dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
 
-	hantro_write_addr(vpu, G2_RASTER_SCAN, dst_dma);
-	hantro_write_addr(vpu, G2_RASTER_SCAN_CHR, dst_dma + cr_offset);
-	hantro_write_addr(vpu, G2_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
-	hantro_write_addr(vpu, G2_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
-	hantro_write_addr(vpu, G2_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
-	hantro_write_addr(vpu, G2_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
+	hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
+	hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + cr_offset);
+	hantro_write_addr(vpu, G2_TILE_SIZES_ADDR, ctx->hevc_dec.tile_sizes.dma);
+	hantro_write_addr(vpu, G2_TILE_FILTER_ADDR, ctx->hevc_dec.tile_filter.dma);
+	hantro_write_addr(vpu, G2_TILE_SAO_ADDR, ctx->hevc_dec.tile_sao.dma);
+	hantro_write_addr(vpu, G2_TILE_BSD_ADDR, ctx->hevc_dec.tile_bsd.dma);
 }
 
 static void hantro_g2_check_idle(struct hantro_dev *vpu)
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
index bb22fa921914..24b18f839ff8 100644
--- a/drivers/staging/media/hantro/hantro_g2_regs.h
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -177,20 +177,20 @@
 #define G2_REG_CONFIG_DEC_CLK_GATE_E		BIT(16)
 #define G2_REG_CONFIG_DEC_CLK_GATE_IDLE_E	BIT(17)
 
-#define G2_ADDR_DST		(G2_SWREG(65))
-#define G2_REG_ADDR_REF(i)	(G2_SWREG(67)  + ((i) * 0x8))
-#define G2_ADDR_DST_CHR		(G2_SWREG(99))
-#define G2_REG_CHR_REF(i)	(G2_SWREG(101) + ((i) * 0x8))
-#define G2_ADDR_DST_MV		(G2_SWREG(133))
-#define G2_REG_DMV_REF(i)	(G2_SWREG(135) + ((i) * 0x8))
-#define G2_ADDR_TILE_SIZE	(G2_SWREG(167))
-#define G2_ADDR_STR		(G2_SWREG(169))
-#define HEVC_SCALING_LIST	(G2_SWREG(171))
-#define G2_RASTER_SCAN		(G2_SWREG(175))
-#define G2_RASTER_SCAN_CHR	(G2_SWREG(177))
-#define G2_TILE_FILTER		(G2_SWREG(179))
-#define G2_TILE_SAO		(G2_SWREG(181))
-#define G2_TILE_BSD		(G2_SWREG(183))
+#define G2_OUT_LUMA_ADDR		(G2_SWREG(65))
+#define G2_REF_LUMA_ADDR(i)		(G2_SWREG(67)  + ((i) * 0x8))
+#define G2_OUT_CHROMA_ADDR		(G2_SWREG(99))
+#define G2_REF_CHROMA_ADDR(i)		(G2_SWREG(101) + ((i) * 0x8))
+#define G2_OUT_MV_ADDR			(G2_SWREG(133))
+#define G2_REF_MV_ADDR(i)		(G2_SWREG(135) + ((i) * 0x8))
+#define G2_TILE_SIZES_ADDR		(G2_SWREG(167))
+#define G2_STREAM_ADDR			(G2_SWREG(169))
+#define G2_HEVC_SCALING_LIST_ADDR	(G2_SWREG(171))
+#define G2_RS_OUT_LUMA_ADDR		(G2_SWREG(175))
+#define G2_RS_OUT_CHROMA_ADDR		(G2_SWREG(177))
+#define G2_TILE_FILTER_ADDR		(G2_SWREG(179))
+#define G2_TILE_SAO_ADDR		(G2_SWREG(181))
+#define G2_TILE_BSD_ADDR		(G2_SWREG(183))
 
 #define g2_strm_buffer_len	G2_DEC_REG(258, 0, 0xffffffff)
 #define g2_strm_start_offset	G2_DEC_REG(259, 0, 0xffffffff)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 37+ messages in thread
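
Reader's note on the renamed address registers in the patch above: the
layout can be sanity-checked with a small standalone sketch. The
definition of G2_SWREG() is not part of the quoted hunk, so the one below
(one 32-bit word per register, i.e. nr * 4) is an assumption, and
g2_offset_to_swreg() is a hypothetical helper added for illustration. The
other macros are copied from the patch.

```c
#include <stdint.h>

/* Assumption: each G2 swreg is one 32-bit word, so register nr is at nr * 4. */
#define G2_SWREG(nr)		((nr) * 4)

/* Names and base indices as introduced by the rename in this patch. */
#define G2_OUT_LUMA_ADDR	(G2_SWREG(65))
#define G2_REF_LUMA_ADDR(i)	(G2_SWREG(67)  + ((i) * 0x8))
#define G2_OUT_CHROMA_ADDR	(G2_SWREG(99))
#define G2_REF_CHROMA_ADDR(i)	(G2_SWREG(101) + ((i) * 0x8))

/* Hypothetical helper: convert a byte offset back to its swreg index. */
static inline uint32_t g2_offset_to_swreg(uint32_t offset)
{
	return offset / 4;
}
```

Under that assumption each reference slot is 8 bytes apart, that is, two
swregs per address. Sixteen luma slots then end at swreg 97, just below
G2_OUT_CHROMA_ADDR at swreg 99, so the renamed blocks do not overlap.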

* [PATCH v7 09/11] media: hantro: Prepare for other G2 codecs
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (7 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 08/11] media: hantro: Rename registers Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 10/11] media: hantro: Support VP9 on the G2 core Andrzej Pietrasiewicz
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel

The VeriSilicon Hantro G2 core supports other codecs besides HEVC.
Factor out some common code in preparation for VP9 support.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/Makefile         |  1 +
 drivers/staging/media/hantro/hantro.h         |  7 +++++
 drivers/staging/media/hantro/hantro_drv.c     |  5 +++
 drivers/staging/media/hantro/hantro_g2.c      | 27 ++++++++++++++++
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 31 -------------------
 drivers/staging/media/hantro/hantro_g2_regs.h |  7 +++++
 drivers/staging/media/hantro/hantro_hw.h      |  2 ++
 7 files changed, 49 insertions(+), 31 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_g2.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 90036831fec4..fe6d84871d07 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -12,6 +12,7 @@ hantro-vpu-y += \
 		hantro_g1_mpeg2_dec.o \
 		hantro_g2_hevc_dec.o \
 		hantro_g1_vp8_dec.o \
+		hantro_g2.o \
 		rockchip_vpu2_hw_jpeg_enc.o \
 		rockchip_vpu2_hw_h264_dec.o \
 		rockchip_vpu2_hw_mpeg2_dec.o \
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index dd5e56765d4e..d91eb2b1c509 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -369,6 +369,13 @@ static inline void vdpu_write(struct hantro_dev *vpu, u32 val, u32 reg)
 	writel(val, vpu->dec_base + reg);
 }
 
+static inline void hantro_write_addr(struct hantro_dev *vpu,
+				     unsigned long offset,
+				     dma_addr_t addr)
+{
+	vdpu_write(vpu, addr & 0xffffffff, offset);
+}
+
 static inline u32 vdpu_read(struct hantro_dev *vpu, u32 reg)
 {
 	u32 val = readl(vpu->dec_base + reg);
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 8a2edd67f2c6..e8eee117d97f 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -905,6 +905,11 @@ static int hantro_probe(struct platform_device *pdev)
 	vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
 	vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
 
+	/*
+	 * TODO: Eventually allow taking advantage of full 64-bit address space.
+	 * Until then we assume the MSB portion of buffers' base addresses is
+	 * always 0 due to this masking operation.
+	 */
 	ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
 	if (ret) {
 		dev_err(vpu->dev, "Could not set DMA coherent mask.\n");
diff --git a/drivers/staging/media/hantro/hantro_g2.c b/drivers/staging/media/hantro/hantro_g2.c
new file mode 100644
index 000000000000..5f7bb27913de
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU codec driver
+ *
+ * Copyright (C) 2021 Collabora Ltd, Andrzej Pietrasiewicz <andrzej.p@collabora.com>
+ */
+
+#include "hantro_hw.h"
+#include "hantro_g2_regs.h"
+
+void hantro_g2_check_idle(struct hantro_dev *vpu)
+{
+	int i;
+
+	for (i = 0; i < 3; i++) {
+		u32 status;
+
+		/* Make sure the VPU is idle */
+		status = vdpu_read(vpu, G2_REG_INTERRUPT);
+		if (status & G2_REG_INTERRUPT_DEC_E) {
+			dev_warn(vpu->dev, "device still running, aborting\n");
+			status |= G2_REG_INTERRUPT_DEC_ABORT_E | G2_REG_INTERRUPT_DEC_IRQ_DIS;
+			vdpu_write(vpu, status, G2_REG_INTERRUPT);
+		}
+	}
+}
+
diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
index 97da719a9844..2797825cef47 100644
--- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -8,20 +8,6 @@
 #include "hantro_hw.h"
 #include "hantro_g2_regs.h"
 
-#define HEVC_DEC_MODE	0xC
-
-#define BUS_WIDTH_32		0
-#define BUS_WIDTH_64		1
-#define BUS_WIDTH_128		2
-#define BUS_WIDTH_256		3
-
-static inline void hantro_write_addr(struct hantro_dev *vpu,
-				     unsigned long offset,
-				     dma_addr_t addr)
-{
-	vdpu_write(vpu, addr & 0xffffffff, offset);
-}
-
 static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
@@ -516,23 +502,6 @@ static void set_buffers(struct hantro_ctx *ctx)
 	hantro_write_addr(vpu, G2_TILE_BSD_ADDR, ctx->hevc_dec.tile_bsd.dma);
 }
 
-static void hantro_g2_check_idle(struct hantro_dev *vpu)
-{
-	int i;
-
-	for (i = 0; i < 3; i++) {
-		u32 status;
-
-		/* Make sure the VPU is idle */
-		status = vdpu_read(vpu, G2_REG_INTERRUPT);
-		if (status & G2_REG_INTERRUPT_DEC_E) {
-			dev_warn(vpu->dev, "device still running, aborting");
-			status |= G2_REG_INTERRUPT_DEC_ABORT_E | G2_REG_INTERRUPT_DEC_IRQ_DIS;
-			vdpu_write(vpu, status, G2_REG_INTERRUPT);
-		}
-	}
-}
-
 int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
index 24b18f839ff8..136ba6d98a1f 100644
--- a/drivers/staging/media/hantro/hantro_g2_regs.h
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -27,6 +27,13 @@
 #define G2_REG_INTERRUPT_DEC_IRQ_DIS	BIT(4)
 #define G2_REG_INTERRUPT_DEC_E		BIT(0)
 
+#define HEVC_DEC_MODE			0xc
+
+#define BUS_WIDTH_32			0
+#define BUS_WIDTH_64			1
+#define BUS_WIDTH_128			2
+#define BUS_WIDTH_256			3
+
 #define g2_strm_swap		G2_DEC_REG(2, 28, 0xf)
 #define g2_dirmv_swap		G2_DEC_REG(2, 20, 0xf)
 
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 4323e63dfbfc..42b3f3961f75 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -308,4 +308,6 @@ void hantro_vp8_dec_exit(struct hantro_ctx *ctx);
 void hantro_vp8_prob_update(struct hantro_ctx *ctx,
 			    const struct v4l2_ctrl_vp8_frame *hdr);
 
+void hantro_g2_check_idle(struct hantro_dev *vpu);
+
 #endif /* HANTRO_HW_H_ */
-- 
2.17.1



* [PATCH v7 10/11] media: hantro: Support VP9 on the G2 core
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (8 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 09/11] media: hantro: Prepare for other G2 codecs Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-09-29 16:04 ` [PATCH v7 11/11] media: hantro: Support NV12 " Andrzej Pietrasiewicz
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel

The VeriSilicon Hantro G2 core supports the VP9 codec. Add decoding
support for it, limited to profile 0.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
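For illustration only, the way recompute_tile_info() in this patch spreads
superblocks across tiles can be sketched in userspace C as below. The
function name example_split_sbs is hypothetical; the arithmetic mirrors the
running-total computation in the patch.

```c
#include <assert.h>

/*
 * Split `sbs` superblocks across `tiles` tiles using a running total,
 * so that the per-tile sizes always add up to `sbs`.
 * out[] must have room for `tiles` entries.
 */
static void example_split_sbs(unsigned short *out, unsigned int tiles,
			      unsigned int sbs)
{
	unsigned int accumulated = 0;
	unsigned int i;

	assert(tiles > 0);

	for (i = 1; i <= tiles; ++i) {
		unsigned int next = i * sbs / tiles;

		out[i - 1] = (unsigned short)(next - accumulated);
		accumulated = next;
	}
}
```

For 10 superblocks and 4 tiles this yields sizes 2, 3, 2, 3: the integer
division remainder is distributed across tiles rather than being added to
the last one.
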
 drivers/staging/media/hantro/Kconfig          |   1 +
 drivers/staging/media/hantro/Makefile         |   6 +-
 drivers/staging/media/hantro/hantro.h         |  26 +
 drivers/staging/media/hantro/hantro_drv.c     |  18 +-
 drivers/staging/media/hantro/hantro_g2_regs.h |  97 ++
 .../staging/media/hantro/hantro_g2_vp9_dec.c  | 980 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  67 ++
 drivers/staging/media/hantro/hantro_v4l2.c    |   6 +
 drivers/staging/media/hantro/hantro_vp9.c     | 240 +++++
 drivers/staging/media/hantro/hantro_vp9.h     | 103 ++
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |  22 +-
 11 files changed, 1562 insertions(+), 4 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
 create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
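
For illustration only, the reference scale factors that config_ref() in
this patch writes to the hor_scale and ver_scale fields can be sketched in
userspace C as below. The names example_ref_scale and EXAMPLE_SCALE_SHIFT
are hypothetical; the shift of 14 is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fractional bits used by the scale factor in the patch. */
#define EXAMPLE_SCALE_SHIFT 14

/*
 * Ratio of a reference frame dimension to the current frame dimension
 * as a fixed-point value; equal dimensions give 1 << 14.
 * cur_dim must be nonzero.
 */
static uint32_t example_ref_scale(uint32_t ref_dim, uint32_t cur_dim)
{
	assert(cur_dim != 0);

	return (ref_dim << EXAMPLE_SCALE_SHIFT) / cur_dim;
}
```

A reference of the same size as the current frame gives 16384, a reference
twice as large gives 32768, and a reference half as large gives 8192.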

diff --git a/drivers/staging/media/hantro/Kconfig b/drivers/staging/media/hantro/Kconfig
index 20b1f6d7b69c..00a57d88c92e 100644
--- a/drivers/staging/media/hantro/Kconfig
+++ b/drivers/staging/media/hantro/Kconfig
@@ -9,6 +9,7 @@ config VIDEO_HANTRO
 	select VIDEOBUF2_VMALLOC
 	select V4L2_MEM2MEM_DEV
 	select V4L2_H264
+	select V4L2_VP9
 	help
 	  Support for the Hantro IP based Video Processing Units present on
 	  Rockchip and NXP i.MX8M SoCs, which accelerate video and image
diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index fe6d84871d07..28af0a1ee4bf 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -10,9 +10,10 @@ hantro-vpu-y += \
 		hantro_g1.o \
 		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
-		hantro_g2_hevc_dec.o \
 		hantro_g1_vp8_dec.o \
 		hantro_g2.o \
+		hantro_g2_hevc_dec.o \
+		hantro_g2_vp9_dec.o \
 		rockchip_vpu2_hw_jpeg_enc.o \
 		rockchip_vpu2_hw_h264_dec.o \
 		rockchip_vpu2_hw_mpeg2_dec.o \
@@ -21,7 +22,8 @@ hantro-vpu-y += \
 		hantro_h264.o \
 		hantro_hevc.o \
 		hantro_mpeg2.o \
-		hantro_vp8.o
+		hantro_vp8.o \
+		hantro_vp9.o
 
 hantro-vpu-$(CONFIG_VIDEO_HANTRO_IMX8M) += \
 		imx8m_vpu_hw.o
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index d91eb2b1c509..1e8c1a6e3eb0 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -36,6 +36,7 @@ struct hantro_postproc_ops;
 #define HANTRO_VP8_DECODER	BIT(17)
 #define HANTRO_H264_DECODER	BIT(18)
 #define HANTRO_HEVC_DECODER	BIT(19)
+#define HANTRO_VP9_DECODER	BIT(20)
 #define HANTRO_DECODERS		0xffff0000
 
 /**
@@ -110,6 +111,7 @@ enum hantro_codec_mode {
 	HANTRO_MODE_MPEG2_DEC,
 	HANTRO_MODE_VP8_DEC,
 	HANTRO_MODE_HEVC_DEC,
+	HANTRO_MODE_VP9_DEC,
 };
 
 /*
@@ -223,6 +225,7 @@ struct hantro_dev {
  * @mpeg2_dec:		MPEG-2-decoding context.
  * @vp8_dec:		VP8-decoding context.
  * @hevc_dec:		HEVC-decoding context.
+ * @vp9_dec:		VP9-decoding context.
  */
 struct hantro_ctx {
 	struct hantro_dev *dev;
@@ -250,6 +253,7 @@ struct hantro_ctx {
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
 		struct hantro_hevc_dec_hw_ctx hevc_dec;
+		struct hantro_vp9_dec_hw_ctx vp9_dec;
 	};
 };
 
@@ -299,6 +303,22 @@ struct hantro_postproc_regs {
 	struct hantro_reg display_width;
 };
 
+struct hantro_vp9_decoded_buffer_info {
+	/* Info needed when the decoded frame serves as a reference frame. */
+	unsigned short width;
+	unsigned short height;
+	u32 bit_depth : 4;
+};
+
+struct hantro_decoded_buffer {
+	/* Must be the first field in this struct. */
+	struct v4l2_m2m_buffer base;
+
+	union {
+		struct hantro_vp9_decoded_buffer_info vp9;
+	};
+};
+
 /* Logging helpers */
 
 /**
@@ -436,6 +456,12 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
 	return vb2_dma_contig_plane_dma_addr(vb, 0);
 }
 
+static inline struct hantro_decoded_buffer *
+vb2_to_hantro_decoded_buf(struct vb2_buffer *buf)
+{
+	return container_of(buf, struct hantro_decoded_buffer, base.vb.vb2_buf);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx);
 void hantro_postproc_enable(struct hantro_ctx *ctx);
 void hantro_postproc_free(struct hantro_ctx *ctx);
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e8eee117d97f..2d5bcce0ff28 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -232,7 +232,7 @@ queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq)
 	dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
 	dst_vq->drv_priv = ctx;
 	dst_vq->ops = &hantro_queue_ops;
-	dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
+	dst_vq->buf_struct_size = sizeof(struct hantro_decoded_buffer);
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	dst_vq->lock = &ctx->dev->vpu_mutex;
 	dst_vq->dev = ctx->dev->v4l2_dev.dev;
@@ -266,6 +266,12 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
 		if (sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED)
 			/* No scaling support */
 			return -EINVAL;
+	} else if (ctrl->id == V4L2_CID_STATELESS_VP9_FRAME) {
+		const struct v4l2_ctrl_vp9_frame *dec_params = ctrl->p_new.p_vp9_frame;
+
+		/* We only support profile 0 */
+		if (dec_params->profile != 0)
+			return -EINVAL;
 	}
 	return 0;
 }
@@ -459,6 +465,16 @@ static const struct hantro_ctrl controls[] = {
 			.step = 1,
 			.ops = &hantro_hevc_ctrl_ops,
 		},
+	}, {
+		.codec = HANTRO_VP9_DECODER,
+		.cfg = {
+			.id = V4L2_CID_STATELESS_VP9_FRAME,
+		},
+	}, {
+		.codec = HANTRO_VP9_DECODER,
+		.cfg = {
+			.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
+		},
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
index 136ba6d98a1f..9c857dd1ad9b 100644
--- a/drivers/staging/media/hantro/hantro_g2_regs.h
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -28,6 +28,7 @@
 #define G2_REG_INTERRUPT_DEC_E		BIT(0)
 
 #define HEVC_DEC_MODE			0xc
+#define VP9_DEC_MODE			0xd
 
 #define BUS_WIDTH_32			0
 #define BUS_WIDTH_64			1
@@ -49,6 +50,7 @@
 #define g2_pic_height_in_cbs	G2_DEC_REG(4, 6,  0x1fff)
 #define g2_num_ref_frames	G2_DEC_REG(4, 0,  0x1f)
 
+#define g2_start_bit		G2_DEC_REG(5, 25, 0x7f)
 #define g2_scaling_list_e	G2_DEC_REG(5, 24, 0x1)
 #define g2_cb_qp_offset		G2_DEC_REG(5, 19, 0x1f)
 #define g2_cr_qp_offset		G2_DEC_REG(5, 14, 0x1f)
@@ -84,6 +86,7 @@
 #define g2_bit_depth_y_minus8	G2_DEC_REG(8, 6,  0x3)
 #define g2_bit_depth_c_minus8	G2_DEC_REG(8, 4,  0x3)
 #define g2_output_8_bits	G2_DEC_REG(8, 3,  0x1)
+#define g2_output_format	G2_DEC_REG(8, 0,  0x7)
 
 #define g2_refidx1_active	G2_DEC_REG(9, 19, 0x1f)
 #define g2_refidx0_active	G2_DEC_REG(9, 14, 0x1f)
@@ -96,6 +99,14 @@
 #define g2_tile_e		G2_DEC_REG(10, 1,  0x1)
 #define g2_entropy_sync_e	G2_DEC_REG(10, 0,  0x1)
 
+#define vp9_transform_mode	G2_DEC_REG(11, 27, 0x7)
+#define vp9_filt_sharpness	G2_DEC_REG(11, 21, 0x7)
+#define vp9_mcomp_filt_type	G2_DEC_REG(11,  8, 0x7)
+#define vp9_high_prec_mv_e	G2_DEC_REG(11,  7, 0x1)
+#define vp9_comp_pred_mode	G2_DEC_REG(11,  4, 0x3)
+#define vp9_gref_sign_bias	G2_DEC_REG(11,  2, 0x1)
+#define vp9_aref_sign_bias	G2_DEC_REG(11,  0, 0x1)
+
 #define g2_refer_lterm_e	G2_DEC_REG(12, 16, 0xffff)
 #define g2_min_cb_size		G2_DEC_REG(12, 13, 0x7)
 #define g2_max_cb_size		G2_DEC_REG(12, 10, 0x7)
@@ -154,6 +165,50 @@
 #define g2_partial_ctb_y	G2_DEC_REG(20, 30, 0x1)
 #define g2_pic_width_4x4	G2_DEC_REG(20, 16, 0xfff)
 #define g2_pic_height_4x4	G2_DEC_REG(20, 0,  0xfff)
+
+#define vp9_qp_delta_y_dc	G2_DEC_REG(13, 23, 0x3f)
+#define vp9_qp_delta_ch_dc	G2_DEC_REG(13, 17, 0x3f)
+#define vp9_qp_delta_ch_ac	G2_DEC_REG(13, 11, 0x3f)
+#define vp9_last_sign_bias	G2_DEC_REG(13, 10, 0x1)
+#define vp9_lossless_e		G2_DEC_REG(13,  9, 0x1)
+#define vp9_comp_pred_var_ref1	G2_DEC_REG(13,  7, 0x3)
+#define vp9_comp_pred_var_ref0	G2_DEC_REG(13,  5, 0x3)
+#define vp9_comp_pred_fixed_ref	G2_DEC_REG(13,  3, 0x3)
+#define vp9_segment_temp_upd_e	G2_DEC_REG(13,  2, 0x1)
+#define vp9_segment_upd_e	G2_DEC_REG(13,  1, 0x1)
+#define vp9_segment_e		G2_DEC_REG(13,  0, 0x1)
+
+#define vp9_filt_level		G2_DEC_REG(14, 18, 0x3f)
+#define vp9_refpic_seg0		G2_DEC_REG(14, 15, 0x7)
+#define vp9_skip_seg0		G2_DEC_REG(14, 14, 0x1)
+#define vp9_filt_level_seg0	G2_DEC_REG(14,  8, 0x3f)
+#define vp9_quant_seg0		G2_DEC_REG(14,  0, 0xff)
+
+#define vp9_refpic_seg1		G2_DEC_REG(15, 15, 0x7)
+#define vp9_skip_seg1		G2_DEC_REG(15, 14, 0x1)
+#define vp9_filt_level_seg1	G2_DEC_REG(15,  8, 0x3f)
+#define vp9_quant_seg1		G2_DEC_REG(15,  0, 0xff)
+
+#define vp9_refpic_seg2		G2_DEC_REG(16, 15, 0x7)
+#define vp9_skip_seg2		G2_DEC_REG(16, 14, 0x1)
+#define vp9_filt_level_seg2	G2_DEC_REG(16,  8, 0x3f)
+#define vp9_quant_seg2		G2_DEC_REG(16,  0, 0xff)
+
+#define vp9_refpic_seg3		G2_DEC_REG(17, 15, 0x7)
+#define vp9_skip_seg3		G2_DEC_REG(17, 14, 0x1)
+#define vp9_filt_level_seg3	G2_DEC_REG(17,  8, 0x3f)
+#define vp9_quant_seg3		G2_DEC_REG(17,  0, 0xff)
+
+#define vp9_refpic_seg4		G2_DEC_REG(18, 15, 0x7)
+#define vp9_skip_seg4		G2_DEC_REG(18, 14, 0x1)
+#define vp9_filt_level_seg4	G2_DEC_REG(18,  8, 0x3f)
+#define vp9_quant_seg4		G2_DEC_REG(18,  0, 0xff)
+
+#define vp9_refpic_seg5		G2_DEC_REG(19, 15, 0x7)
+#define vp9_skip_seg5		G2_DEC_REG(19, 14, 0x1)
+#define vp9_filt_level_seg5	G2_DEC_REG(19,  8, 0x3f)
+#define vp9_quant_seg5		G2_DEC_REG(19,  0, 0xff)
+
 #define hevc_cur_poc_00		G2_DEC_REG(46, 24, 0xff)
 #define hevc_cur_poc_01		G2_DEC_REG(46, 16, 0xff)
 #define hevc_cur_poc_02		G2_DEC_REG(46, 8,  0xff)
@@ -174,6 +229,44 @@
 #define hevc_cur_poc_14		G2_DEC_REG(49, 8,  0xff)
 #define hevc_cur_poc_15		G2_DEC_REG(49, 0,  0xff)
 
+#define vp9_refpic_seg6		G2_DEC_REG(31, 15, 0x7)
+#define vp9_skip_seg6		G2_DEC_REG(31, 14, 0x1)
+#define vp9_filt_level_seg6	G2_DEC_REG(31,  8, 0x3f)
+#define vp9_quant_seg6		G2_DEC_REG(31,  0, 0xff)
+
+#define vp9_refpic_seg7		G2_DEC_REG(32, 15, 0x7)
+#define vp9_skip_seg7		G2_DEC_REG(32, 14, 0x1)
+#define vp9_filt_level_seg7	G2_DEC_REG(32,  8, 0x3f)
+#define vp9_quant_seg7		G2_DEC_REG(32,  0, 0xff)
+
+#define vp9_lref_width		G2_DEC_REG(33, 16, 0xffff)
+#define vp9_lref_height		G2_DEC_REG(33,  0, 0xffff)
+
+#define vp9_gref_width		G2_DEC_REG(34, 16, 0xffff)
+#define vp9_gref_height		G2_DEC_REG(34,  0, 0xffff)
+
+#define vp9_aref_width		G2_DEC_REG(35, 16, 0xffff)
+#define vp9_aref_height		G2_DEC_REG(35,  0, 0xffff)
+
+#define vp9_lref_hor_scale	G2_DEC_REG(36, 16, 0xffff)
+#define vp9_lref_ver_scale	G2_DEC_REG(36,  0, 0xffff)
+
+#define vp9_gref_hor_scale	G2_DEC_REG(37, 16, 0xffff)
+#define vp9_gref_ver_scale	G2_DEC_REG(37,  0, 0xffff)
+
+#define vp9_aref_hor_scale	G2_DEC_REG(38, 16, 0xffff)
+#define vp9_aref_ver_scale	G2_DEC_REG(38,  0, 0xffff)
+
+#define vp9_filt_ref_adj_0	G2_DEC_REG(46, 24, 0x7f)
+#define vp9_filt_ref_adj_1	G2_DEC_REG(46, 16, 0x7f)
+#define vp9_filt_ref_adj_2	G2_DEC_REG(46,  8, 0x7f)
+#define vp9_filt_ref_adj_3	G2_DEC_REG(46,  0, 0x7f)
+
+#define vp9_filt_mb_adj_0	G2_DEC_REG(47, 24, 0x7f)
+#define vp9_filt_mb_adj_1	G2_DEC_REG(47, 16, 0x7f)
+#define vp9_filt_mb_adj_2	G2_DEC_REG(47,  8, 0x7f)
+#define vp9_filt_mb_adj_3	G2_DEC_REG(47,  0, 0x7f)
+
 #define g2_apf_threshold	G2_DEC_REG(55, 0, 0xffff)
 
 #define g2_clk_gate_e		G2_DEC_REG(58, 16, 0x1)
@@ -186,6 +279,8 @@
 
 #define G2_OUT_LUMA_ADDR		(G2_SWREG(65))
 #define G2_REF_LUMA_ADDR(i)		(G2_SWREG(67)  + ((i) * 0x8))
+#define G2_VP9_SEGMENT_WRITE_ADDR	(G2_SWREG(79))
+#define G2_VP9_SEGMENT_READ_ADDR	(G2_SWREG(81))
 #define G2_OUT_CHROMA_ADDR		(G2_SWREG(99))
 #define G2_REF_CHROMA_ADDR(i)		(G2_SWREG(101) + ((i) * 0x8))
 #define G2_OUT_MV_ADDR			(G2_SWREG(133))
@@ -193,6 +288,8 @@
 #define G2_TILE_SIZES_ADDR		(G2_SWREG(167))
 #define G2_STREAM_ADDR			(G2_SWREG(169))
 #define G2_HEVC_SCALING_LIST_ADDR	(G2_SWREG(171))
+#define G2_VP9_CTX_COUNT_ADDR		(G2_SWREG(171))
+#define G2_VP9_PROBS_ADDR		(G2_SWREG(173))
 #define G2_RS_OUT_LUMA_ADDR		(G2_SWREG(175))
 #define G2_RS_OUT_CHROMA_ADDR		(G2_SWREG(177))
 #define G2_TILE_FILTER_ADDR		(G2_SWREG(179))
diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
new file mode 100644
index 000000000000..7f827b9f0133
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
@@ -0,0 +1,980 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VP9 codec driver
+ *
+ * Copyright (C) 2021 Collabora Ltd.
+ */
+#include "media/videobuf2-core.h"
+#include "media/videobuf2-dma-contig.h"
+#include "media/videobuf2-v4l2.h"
+#include <linux/kernel.h>
+#include <linux/vmalloc.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/v4l2-vp9.h>
+
+#include "hantro.h"
+#include "hantro_vp9.h"
+#include "hantro_g2_regs.h"
+
+#define G2_ALIGN 16
+
+enum hantro_ref_frames {
+	INTRA_FRAME = 0,
+	LAST_FRAME = 1,
+	GOLDEN_FRAME = 2,
+	ALTREF_FRAME = 3,
+	MAX_REF_FRAMES = 4
+};
+
+static int start_prepare_run(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame **dec_params)
+{
+	const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates;
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	struct v4l2_ctrl *ctrl;
+	unsigned int fctx_idx;
+
+	/* v4l2-specific stuff */
+	hantro_start_prepare_run(ctx);
+
+	ctrl = v4l2_ctrl_find(&ctx->ctrl_handler, V4L2_CID_STATELESS_VP9_FRAME);
+	if (WARN_ON(!ctrl))
+		return -EINVAL;
+	*dec_params = ctrl->p_cur.p;
+
+	ctrl = v4l2_ctrl_find(&ctx->ctrl_handler, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
+	if (WARN_ON(!ctrl))
+		return -EINVAL;
+	prob_updates = ctrl->p_cur.p;
+	vp9_ctx->cur.tx_mode = prob_updates->tx_mode;
+
+	/*
+	 * vp9 stuff
+	 *
+	 * by this point the userspace has done all parts of 6.2 uncompressed_header()
+	 * except this fragment:
+	 * if ( FrameIsIntra || error_resilient_mode ) {
+	 *	setup_past_independence ( )
+	 *	if ( frame_type == KEY_FRAME || error_resilient_mode == 1 ||
+	 *	     reset_frame_context == 3 ) {
+	 *		for ( i = 0; i < 4; i ++ ) {
+	 *			save_probs( i )
+	 *		}
+	 *	} else if ( reset_frame_context == 2 ) {
+	 *		save_probs( frame_context_idx )
+	 *	}
+	 *	frame_context_idx = 0
+	 * }
+	 */
+	fctx_idx = v4l2_vp9_reset_frame_ctx(*dec_params, vp9_ctx->frame_context);
+	vp9_ctx->cur.frame_context_idx = fctx_idx;
+
+	/* 6.1 frame(sz): load_probs() and load_probs2() */
+	vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx];
+
+	/*
+	 * The userspace has also performed 6.3 compressed_header(), but handling the
+	 * probs in a special way. All probs which need updating, except MV-related,
+	 * have been read from the bitstream and translated through inv_map_table[],
+	 * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed
+	 * by userspace are either translated values (there are no 0 values in
+	 * inv_map_table[]), or zero to indicate no update. All MV-related probs which need
+	 * updating have been read from the bitstream and (mv_prob << 1) | 1 has been
+	 * performed. The values passed by userspace are either new values
+	 * to replace old ones (the above mentioned shift and bitwise or never result in
+	 * a zero) or zero to indicate no update.
+	 * fw_update_probs() performs actual probs updates or leaves probs as-is
+	 * for values for which a zero was passed from userspace.
+	 */
+	v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, *dec_params);
+
+	return 0;
+}
+
+static size_t chroma_offset(const struct hantro_ctx *ctx,
+			    const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	int bytes_per_pixel = dec_params->bit_depth == 8 ? 1 : 2;
+
+	return ctx->src_fmt.width * ctx->src_fmt.height * bytes_per_pixel;
+}
+
+static size_t mv_offset(const struct hantro_ctx *ctx,
+			const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	size_t cr_offset = chroma_offset(ctx, dec_params);
+
+	return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
+}
+
+static struct hantro_decoded_buffer *
+get_ref_buf(struct hantro_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp)
+{
+	struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
+	struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
+	int buf_idx;
+
+	/*
+	 * If a ref is unused or invalid, address of current destination
+	 * buffer is returned.
+	 */
+	buf_idx = vb2_find_timestamp(cap_q, timestamp, 0);
+	if (buf_idx < 0)
+		return vb2_to_hantro_decoded_buf(&dst->vb2_buf);
+
+	return vb2_to_hantro_decoded_buf(vb2_get_buffer(cap_q, buf_idx));
+}
+
+static void update_dec_buf_info(struct hantro_decoded_buffer *buf,
+				const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	buf->vp9.width = dec_params->frame_width_minus_1 + 1;
+	buf->vp9.height = dec_params->frame_height_minus_1 + 1;
+	buf->vp9.bit_depth = dec_params->bit_depth;
+}
+
+static void update_ctx_cur_info(struct hantro_vp9_dec_hw_ctx *vp9_ctx,
+				struct hantro_decoded_buffer *buf,
+				const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	vp9_ctx->cur.valid = true;
+	vp9_ctx->cur.reference_mode = dec_params->reference_mode;
+	vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter;
+	vp9_ctx->cur.flags = dec_params->flags;
+	vp9_ctx->cur.timestamp = buf->base.vb.vb2_buf.timestamp;
+}
+
+static void config_output(struct hantro_ctx *ctx,
+			  struct hantro_decoded_buffer *dst,
+			  const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	dma_addr_t luma_addr, chroma_addr, mv_addr;
+
+	hantro_reg_write(ctx->dev, &g2_out_dis, 0);
+	hantro_reg_write(ctx->dev, &g2_output_format, 0);
+
+	luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
+	hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
+
+	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
+	hantro_write_addr(ctx->dev, G2_OUT_CHROMA_ADDR, chroma_addr);
+
+	mv_addr = luma_addr + mv_offset(ctx, dec_params);
+	hantro_write_addr(ctx->dev, G2_OUT_MV_ADDR, mv_addr);
+}
+
+struct hantro_vp9_ref_reg {
+	const struct hantro_reg width;
+	const struct hantro_reg height;
+	const struct hantro_reg hor_scale;
+	const struct hantro_reg ver_scale;
+	u32 y_base;
+	u32 c_base;
+};
+
+static void config_ref(struct hantro_ctx *ctx,
+		       struct hantro_decoded_buffer *dst,
+		       const struct hantro_vp9_ref_reg *ref_reg,
+		       const struct v4l2_ctrl_vp9_frame *dec_params,
+		       u64 ref_ts)
+{
+	struct hantro_decoded_buffer *buf;
+	dma_addr_t luma_addr, chroma_addr;
+	u32 refw, refh;
+
+	buf = get_ref_buf(ctx, &dst->base.vb, ref_ts);
+	refw = buf->vp9.width;
+	refh = buf->vp9.height;
+
+	hantro_reg_write(ctx->dev, &ref_reg->width, refw);
+	hantro_reg_write(ctx->dev, &ref_reg->height, refh);
+
+	hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
+	hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
+
+	luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
+	hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
+
+	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
+	hantro_write_addr(ctx->dev, ref_reg->c_base, chroma_addr);
+}
+
+static void config_ref_registers(struct hantro_ctx *ctx,
+				 const struct v4l2_ctrl_vp9_frame *dec_params,
+				 struct hantro_decoded_buffer *dst,
+				 struct hantro_decoded_buffer *mv_ref)
+{
+	static const struct hantro_vp9_ref_reg ref_regs[] = {
+		{
+			/* Last */
+			.width = vp9_lref_width,
+			.height = vp9_lref_height,
+			.hor_scale = vp9_lref_hor_scale,
+			.ver_scale = vp9_lref_ver_scale,
+			.y_base = G2_REF_LUMA_ADDR(0),
+			.c_base = G2_REF_CHROMA_ADDR(0),
+		}, {
+			/* Golden */
+			.width = vp9_gref_width,
+			.height = vp9_gref_height,
+			.hor_scale = vp9_gref_hor_scale,
+			.ver_scale = vp9_gref_ver_scale,
+			.y_base = G2_REF_LUMA_ADDR(4),
+			.c_base = G2_REF_CHROMA_ADDR(4),
+		}, {
+			/* Altref */
+			.width = vp9_aref_width,
+			.height = vp9_aref_height,
+			.hor_scale = vp9_aref_hor_scale,
+			.ver_scale = vp9_aref_ver_scale,
+			.y_base = G2_REF_LUMA_ADDR(5),
+			.c_base = G2_REF_CHROMA_ADDR(5),
+		},
+	};
+	dma_addr_t mv_addr;
+
+	config_ref(ctx, dst, &ref_regs[0], dec_params, dec_params->last_frame_ts);
+	config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
+	config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
+
+	mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
+		  mv_offset(ctx, dec_params);
+	hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
+
+	hantro_reg_write(ctx->dev, &vp9_last_sign_bias,
+			 dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_LAST ? 1 : 0);
+
+	hantro_reg_write(ctx->dev, &vp9_gref_sign_bias,
+			 dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_GOLDEN ? 1 : 0);
+
+	hantro_reg_write(ctx->dev, &vp9_aref_sign_bias,
+			 dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_ALT ? 1 : 0);
+}
+
+static void recompute_tile_info(unsigned short *tile_info, unsigned int tiles, unsigned int sbs)
+{
+	int i;
+	unsigned int accumulated = 0;
+	unsigned int next_accumulated;
+
+	for (i = 1; i <= tiles; ++i) {
+		next_accumulated = i * sbs / tiles;
+		*tile_info++ = next_accumulated - accumulated;
+		accumulated = next_accumulated;
+	}
+}
+
+static void
+recompute_tile_rc_info(struct hantro_ctx *ctx,
+		       unsigned int tile_r, unsigned int tile_c,
+		       unsigned int sbs_r, unsigned int sbs_c)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+
+	recompute_tile_info(vp9_ctx->tile_r_info, tile_r, sbs_r);
+	recompute_tile_info(vp9_ctx->tile_c_info, tile_c, sbs_c);
+
+	vp9_ctx->last_tile_r = tile_r;
+	vp9_ctx->last_tile_c = tile_c;
+	vp9_ctx->last_sbs_r = sbs_r;
+	vp9_ctx->last_sbs_c = sbs_c;
+}
+
+static inline unsigned int first_tile_row(unsigned int tile_r, unsigned int sbs_r)
+{
+	if (tile_r == sbs_r + 1)
+		return 1;
+
+	if (tile_r == sbs_r + 2)
+		return 2;
+
+	return 0;
+}
+
+static void
+fill_tile_info(struct hantro_ctx *ctx,
+	       unsigned int tile_r, unsigned int tile_c,
+	       unsigned int sbs_r, unsigned int sbs_c,
+	       unsigned short *tile_mem)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	unsigned int i, j;
+	bool first = true;
+
+	for (i = first_tile_row(tile_r, sbs_r); i < tile_r; ++i) {
+		unsigned short r_info = vp9_ctx->tile_r_info[i];
+
+		if (first) {
+			if (i > 0)
+				r_info += vp9_ctx->tile_r_info[0];
+			if (i == 2)
+				r_info += vp9_ctx->tile_r_info[1];
+			first = false;
+		}
+		for (j = 0; j < tile_c; ++j) {
+			*tile_mem++ = vp9_ctx->tile_c_info[j];
+			*tile_mem++ = r_info;
+		}
+	}
+}
+
+static void
+config_tiles(struct hantro_ctx *ctx,
+	     const struct v4l2_ctrl_vp9_frame *dec_params,
+	     struct hantro_decoded_buffer *dst)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	struct hantro_aux_buf *misc = &vp9_ctx->misc;
+	struct hantro_aux_buf *tile_edge = &vp9_ctx->tile_edge;
+	dma_addr_t addr;
+	unsigned short *tile_mem;
+
+	addr = misc->dma + vp9_ctx->tile_info_offset;
+	hantro_write_addr(ctx->dev, G2_TILE_SIZES_ADDR, addr);
+
+	tile_mem = misc->cpu + vp9_ctx->tile_info_offset;
+	if (dec_params->tile_cols_log2 || dec_params->tile_rows_log2) {
+		unsigned int tile_r = (1 << dec_params->tile_rows_log2);
+		unsigned int tile_c = (1 << dec_params->tile_cols_log2);
+		unsigned int sbs_r = hantro_vp9_num_sbs(dst->vp9.height);
+		unsigned int sbs_c = hantro_vp9_num_sbs(dst->vp9.width);
+
+		if (tile_r != vp9_ctx->last_tile_r || tile_c != vp9_ctx->last_tile_c ||
+		    sbs_r != vp9_ctx->last_sbs_r || sbs_c != vp9_ctx->last_sbs_c)
+			recompute_tile_rc_info(ctx, tile_r, tile_c, sbs_r, sbs_c);
+
+		fill_tile_info(ctx, tile_r, tile_c, sbs_r, sbs_c, tile_mem);
+
+		hantro_reg_write(ctx->dev, &g2_tile_e, 1);
+		hantro_reg_write(ctx->dev, &g2_num_tile_cols, tile_c);
+		hantro_reg_write(ctx->dev, &g2_num_tile_rows, tile_r);
+
+	} else {
+		tile_mem[0] = hantro_vp9_num_sbs(dst->vp9.width);
+		tile_mem[1] = hantro_vp9_num_sbs(dst->vp9.height);
+
+		hantro_reg_write(ctx->dev, &g2_tile_e, 0);
+		hantro_reg_write(ctx->dev, &g2_num_tile_cols, 1);
+		hantro_reg_write(ctx->dev, &g2_num_tile_rows, 1);
+	}
+
+	/* provide aux buffers even if no tiles are used */
+	addr = tile_edge->dma;
+	hantro_write_addr(ctx->dev, G2_TILE_FILTER_ADDR, addr);
+
+	addr = tile_edge->dma + vp9_ctx->bsd_ctrl_offset;
+	hantro_write_addr(ctx->dev, G2_TILE_BSD_ADDR, addr);
+}
+
+static void
+update_feat_and_flag(struct hantro_vp9_dec_hw_ctx *vp9_ctx,
+		     const struct v4l2_vp9_segmentation *seg,
+		     unsigned int feature,
+		     unsigned int segid)
+{
+	u8 mask = V4L2_VP9_SEGMENT_FEATURE_ENABLED(feature);
+
+	vp9_ctx->feature_data[segid][feature] = seg->feature_data[segid][feature];
+	vp9_ctx->feature_enabled[segid] &= ~mask;
+	vp9_ctx->feature_enabled[segid] |= (seg->feature_enabled[segid] & mask);
+}
+
+static inline s16 clip3(s16 x, s16 y, s16 z)
+{
+	return (z < x) ? x : (z > y) ? y : z;
+}
+
+static s16 feat_val_clip3(s16 feat_val, s16 feature_data, bool absolute, u8 clip)
+{
+	if (absolute)
+		return feature_data;
+
+	return clip3(0, clip, feat_val + feature_data);
+}
+
+static void config_segment(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	const struct v4l2_vp9_segmentation *seg;
+	s16 feat_val;
+	unsigned char feat_id;
+	unsigned int segid;
+	bool segment_enabled, absolute, update_data;
+
+	static const struct hantro_reg seg_regs[8][V4L2_VP9_SEG_LVL_MAX] = {
+		{ vp9_quant_seg0, vp9_filt_level_seg0, vp9_refpic_seg0, vp9_skip_seg0 },
+		{ vp9_quant_seg1, vp9_filt_level_seg1, vp9_refpic_seg1, vp9_skip_seg1 },
+		{ vp9_quant_seg2, vp9_filt_level_seg2, vp9_refpic_seg2, vp9_skip_seg2 },
+		{ vp9_quant_seg3, vp9_filt_level_seg3, vp9_refpic_seg3, vp9_skip_seg3 },
+		{ vp9_quant_seg4, vp9_filt_level_seg4, vp9_refpic_seg4, vp9_skip_seg4 },
+		{ vp9_quant_seg5, vp9_filt_level_seg5, vp9_refpic_seg5, vp9_skip_seg5 },
+		{ vp9_quant_seg6, vp9_filt_level_seg6, vp9_refpic_seg6, vp9_skip_seg6 },
+		{ vp9_quant_seg7, vp9_filt_level_seg7, vp9_refpic_seg7, vp9_skip_seg7 },
+	};
+
+	segment_enabled = !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED);
+	hantro_reg_write(ctx->dev, &vp9_segment_e, segment_enabled);
+	hantro_reg_write(ctx->dev, &vp9_segment_upd_e,
+			 !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP));
+	hantro_reg_write(ctx->dev, &vp9_segment_temp_upd_e,
+			 !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE));
+
+	seg = &dec_params->seg;
+	absolute = !!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE);
+	update_data = !!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA);
+
+	for (segid = 0; segid < 8; ++segid) {
+		/* Quantizer segment feature */
+		feat_id = V4L2_VP9_SEG_LVL_ALT_Q;
+		feat_val = dec_params->quant.base_q_idx;
+		if (segment_enabled) {
+			if (update_data)
+				update_feat_and_flag(vp9_ctx, seg, feat_id, segid);
+			if (v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid))
+				feat_val = feat_val_clip3(feat_val,
+							  vp9_ctx->feature_data[segid][feat_id],
+							  absolute, 255);
+		}
+		hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val);
+
+		/* Loop filter segment feature */
+		feat_id = V4L2_VP9_SEG_LVL_ALT_L;
+		feat_val = dec_params->lf.level;
+		if (segment_enabled) {
+			if (update_data)
+				update_feat_and_flag(vp9_ctx, seg, feat_id, segid);
+			if (v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid))
+				feat_val = feat_val_clip3(feat_val,
+							  vp9_ctx->feature_data[segid][feat_id],
+							  absolute, 63);
+		}
+		hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val);
+
+		/* Reference frame segment feature */
+		feat_id = V4L2_VP9_SEG_LVL_REF_FRAME;
+		feat_val = 0;
+		if (segment_enabled) {
+			if (update_data)
+				update_feat_and_flag(vp9_ctx, seg, feat_id, segid);
+			if (!(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) &&
+			    v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid))
+				feat_val = vp9_ctx->feature_data[segid][feat_id] + 1;
+		}
+		hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val);
+
+		/* Skip segment feature */
+		feat_id = V4L2_VP9_SEG_LVL_SKIP;
+		feat_val = 0;
+		if (segment_enabled) {
+			if (update_data)
+				update_feat_and_flag(vp9_ctx, seg, feat_id, segid);
+			feat_val = v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled,
+							     feat_id, segid) ? 1 : 0;
+		}
+		hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val);
+	}
+}
+
+static void config_loop_filter(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	bool d = dec_params->lf.flags & V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED;
+
+	hantro_reg_write(ctx->dev, &vp9_filt_level, dec_params->lf.level);
+	hantro_reg_write(ctx->dev, &g2_out_filtering_dis, dec_params->lf.level == 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_sharpness, dec_params->lf.sharpness);
+
+	hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_0, d ? dec_params->lf.ref_deltas[0] : 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_1, d ? dec_params->lf.ref_deltas[1] : 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_2, d ? dec_params->lf.ref_deltas[2] : 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_3, d ? dec_params->lf.ref_deltas[3] : 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_mb_adj_0, d ? dec_params->lf.mode_deltas[0] : 0);
+	hantro_reg_write(ctx->dev, &vp9_filt_mb_adj_1, d ? dec_params->lf.mode_deltas[1] : 0);
+}
+
+static void config_picture_dimensions(struct hantro_ctx *ctx, struct hantro_decoded_buffer *dst)
+{
+	u32 pic_w_4x4, pic_h_4x4;
+
+	hantro_reg_write(ctx->dev, &g2_pic_width_in_cbs, (dst->vp9.width + 7) / 8);
+	hantro_reg_write(ctx->dev, &g2_pic_height_in_cbs, (dst->vp9.height + 7) / 8);
+	pic_w_4x4 = roundup(dst->vp9.width, 8) >> 2;
+	pic_h_4x4 = roundup(dst->vp9.height, 8) >> 2;
+	hantro_reg_write(ctx->dev, &g2_pic_width_4x4, pic_w_4x4);
+	hantro_reg_write(ctx->dev, &g2_pic_height_4x4, pic_h_4x4);
+}
+
+static void
+config_bit_depth(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	hantro_reg_write(ctx->dev, &g2_bit_depth_y_minus8, dec_params->bit_depth - 8);
+	hantro_reg_write(ctx->dev, &g2_bit_depth_c_minus8, dec_params->bit_depth - 8);
+}
+
+static inline bool is_lossless(const struct v4l2_vp9_quantization *quant)
+{
+	return quant->base_q_idx == 0 && quant->delta_q_uv_ac == 0 &&
+	       quant->delta_q_uv_dc == 0 && quant->delta_q_y_dc == 0;
+}
+
+static void
+config_quant(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	hantro_reg_write(ctx->dev, &vp9_qp_delta_y_dc, dec_params->quant.delta_q_y_dc);
+	hantro_reg_write(ctx->dev, &vp9_qp_delta_ch_dc, dec_params->quant.delta_q_uv_dc);
+	hantro_reg_write(ctx->dev, &vp9_qp_delta_ch_ac, dec_params->quant.delta_q_uv_ac);
+	hantro_reg_write(ctx->dev, &vp9_lossless_e, is_lossless(&dec_params->quant));
+}
+
+static u32
+hantro_interp_filter_from_v4l2(unsigned int interpolation_filter)
+{
+	switch (interpolation_filter) {
+	case V4L2_VP9_INTERP_FILTER_EIGHTTAP:
+		return 0x1;
+	case V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH:
+		return 0;
+	case V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP:
+		return 0x2;
+	case V4L2_VP9_INTERP_FILTER_BILINEAR:
+		return 0x3;
+	case V4L2_VP9_INTERP_FILTER_SWITCHABLE:
+		return 0x4;
+	}
+
+	return 0;
+}
+
+static void
+config_others(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params,
+	      bool intra_only, bool resolution_change)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+
+	hantro_reg_write(ctx->dev, &g2_idr_pic_e, intra_only);
+
+	hantro_reg_write(ctx->dev, &vp9_transform_mode, vp9_ctx->cur.tx_mode);
+
+	hantro_reg_write(ctx->dev, &vp9_mcomp_filt_type, intra_only ?
+		0 : hantro_interp_filter_from_v4l2(dec_params->interpolation_filter));
+
+	hantro_reg_write(ctx->dev, &vp9_high_prec_mv_e,
+			 !!(dec_params->flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV));
+
+	hantro_reg_write(ctx->dev, &vp9_comp_pred_mode, dec_params->reference_mode);
+
+	hantro_reg_write(ctx->dev, &g2_tempor_mvp_e,
+			 !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
+			 !(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) &&
+			 !(vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) &&
+			 !(dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY) &&
+			 !resolution_change &&
+			 vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_SHOW_FRAME
+	);
+
+	hantro_reg_write(ctx->dev, &g2_write_mvs_e,
+			 !(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME));
+}
+
+static void
+config_compound_reference(struct hantro_ctx *ctx,
+			  const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	u32 comp_fixed_ref, comp_var_ref[2];
+	bool last_ref_frame_sign_bias;
+	bool golden_ref_frame_sign_bias;
+	bool alt_ref_frame_sign_bias;
+	bool comp_ref_allowed = false;
+
+	comp_fixed_ref = 0;
+	comp_var_ref[0] = 0;
+	comp_var_ref[1] = 0;
+
+	last_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_LAST;
+	golden_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_GOLDEN;
+	alt_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_ALT;
+
+	/* 6.3.12 Frame reference mode syntax */
+	comp_ref_allowed |= golden_ref_frame_sign_bias != last_ref_frame_sign_bias;
+	comp_ref_allowed |= alt_ref_frame_sign_bias != last_ref_frame_sign_bias;
+
+	if (comp_ref_allowed) {
+		if (last_ref_frame_sign_bias ==
+		    golden_ref_frame_sign_bias) {
+			comp_fixed_ref = ALTREF_FRAME;
+			comp_var_ref[0] = LAST_FRAME;
+			comp_var_ref[1] = GOLDEN_FRAME;
+		} else if (last_ref_frame_sign_bias ==
+			   alt_ref_frame_sign_bias) {
+			comp_fixed_ref = GOLDEN_FRAME;
+			comp_var_ref[0] = LAST_FRAME;
+			comp_var_ref[1] = ALTREF_FRAME;
+		} else {
+			comp_fixed_ref = LAST_FRAME;
+			comp_var_ref[0] = GOLDEN_FRAME;
+			comp_var_ref[1] = ALTREF_FRAME;
+		}
+	}
+
+	hantro_reg_write(ctx->dev, &vp9_comp_pred_fixed_ref, comp_fixed_ref);
+	hantro_reg_write(ctx->dev, &vp9_comp_pred_var_ref0, comp_var_ref[0]);
+	hantro_reg_write(ctx->dev, &vp9_comp_pred_var_ref1, comp_var_ref[1]);
+}
+
+#define INNER_LOOP \
+do {									\
+	for (m = 0; m < ARRAY_SIZE(adaptive->coef[0][0][0][0]); ++m) {	\
+		memcpy(adaptive->coef[i][j][k][l][m],			\
+		       probs->coef[i][j][k][l][m],			\
+		       sizeof(probs->coef[i][j][k][l][m]));		\
+									\
+		adaptive->coef[i][j][k][l][m][3] = 0;			\
+	}								\
+} while (0)
+
+static void config_probs(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	struct hantro_aux_buf *misc = &vp9_ctx->misc;
+	struct hantro_g2_all_probs *all_probs = misc->cpu;
+	struct hantro_g2_probs *adaptive;
+	struct hantro_g2_mv_probs *mv;
+	const struct v4l2_vp9_segmentation *seg = &dec_params->seg;
+	const struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables;
+	int i, j, k, l, m;
+
+	for (i = 0; i < ARRAY_SIZE(all_probs->kf_y_mode_prob); ++i)
+		for (j = 0; j < ARRAY_SIZE(all_probs->kf_y_mode_prob[0]); ++j) {
+			memcpy(all_probs->kf_y_mode_prob[i][j],
+			       v4l2_vp9_kf_y_mode_prob[i][j],
+			       ARRAY_SIZE(all_probs->kf_y_mode_prob[i][j]));
+
+			all_probs->kf_y_mode_prob_tail[i][j][0] =
+				v4l2_vp9_kf_y_mode_prob[i][j][8];
+		}
+
+	memcpy(all_probs->mb_segment_tree_probs, seg->tree_probs,
+	       sizeof(all_probs->mb_segment_tree_probs));
+
+	memcpy(all_probs->segment_pred_probs, seg->pred_probs,
+	       sizeof(all_probs->segment_pred_probs));
+
+	for (i = 0; i < ARRAY_SIZE(all_probs->kf_uv_mode_prob); ++i) {
+		memcpy(all_probs->kf_uv_mode_prob[i], v4l2_vp9_kf_uv_mode_prob[i],
+		       ARRAY_SIZE(all_probs->kf_uv_mode_prob[i]));
+
+		all_probs->kf_uv_mode_prob_tail[i][0] = v4l2_vp9_kf_uv_mode_prob[i][8];
+	}
+
+	adaptive = &all_probs->probs;
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->inter_mode); ++i) {
+		memcpy(adaptive->inter_mode[i], probs->inter_mode[i],
+		       sizeof(probs->inter_mode[i]));
+
+		adaptive->inter_mode[i][3] = 0;
+	}
+
+	memcpy(adaptive->is_inter, probs->is_inter, sizeof(adaptive->is_inter));
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->uv_mode); ++i) {
+		memcpy(adaptive->uv_mode[i], probs->uv_mode[i],
+		       sizeof(adaptive->uv_mode[i]));
+		adaptive->uv_mode_tail[i][0] = probs->uv_mode[i][8];
+	}
+
+	memcpy(adaptive->tx8, probs->tx8, sizeof(adaptive->tx8));
+	memcpy(adaptive->tx16, probs->tx16, sizeof(adaptive->tx16));
+	memcpy(adaptive->tx32, probs->tx32, sizeof(adaptive->tx32));
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->y_mode); ++i) {
+		memcpy(adaptive->y_mode[i], probs->y_mode[i],
+		       ARRAY_SIZE(adaptive->y_mode[i]));
+
+		adaptive->y_mode_tail[i][0] = probs->y_mode[i][8];
+	}
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->partition[0]); ++i) {
+		memcpy(adaptive->partition[0][i], v4l2_vp9_kf_partition_probs[i],
+		       sizeof(v4l2_vp9_kf_partition_probs[i]));
+
+		adaptive->partition[0][i][3] = 0;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->partition[1]); ++i) {
+		memcpy(adaptive->partition[1][i], probs->partition[i],
+		       sizeof(probs->partition[i]));
+
+		adaptive->partition[1][i][3] = 0;
+	}
+
+	memcpy(adaptive->interp_filter, probs->interp_filter,
+	       sizeof(adaptive->interp_filter));
+
+	memcpy(adaptive->comp_mode, probs->comp_mode, sizeof(adaptive->comp_mode));
+
+	memcpy(adaptive->skip, probs->skip, sizeof(adaptive->skip));
+
+	mv = &adaptive->mv;
+
+	memcpy(mv->joint, probs->mv.joint, sizeof(mv->joint));
+	memcpy(mv->sign, probs->mv.sign, sizeof(mv->sign));
+	memcpy(mv->class0_bit, probs->mv.class0_bit, sizeof(mv->class0_bit));
+	memcpy(mv->fr, probs->mv.fr, sizeof(mv->fr));
+	memcpy(mv->class0_hp, probs->mv.class0_hp, sizeof(mv->class0_hp));
+	memcpy(mv->hp, probs->mv.hp, sizeof(mv->hp));
+	memcpy(mv->classes, probs->mv.classes, sizeof(mv->classes));
+	memcpy(mv->class0_fr, probs->mv.class0_fr, sizeof(mv->class0_fr));
+	memcpy(mv->bits, probs->mv.bits, sizeof(mv->bits));
+
+	memcpy(adaptive->single_ref, probs->single_ref, sizeof(adaptive->single_ref));
+
+	memcpy(adaptive->comp_ref, probs->comp_ref, sizeof(adaptive->comp_ref));
+
+	for (i = 0; i < ARRAY_SIZE(adaptive->coef); ++i)
+		for (j = 0; j < ARRAY_SIZE(adaptive->coef[0]); ++j)
+			for (k = 0; k < ARRAY_SIZE(adaptive->coef[0][0]); ++k)
+				for (l = 0; l < ARRAY_SIZE(adaptive->coef[0][0][0]); ++l)
+					INNER_LOOP;
+
+	hantro_write_addr(ctx->dev, G2_VP9_PROBS_ADDR, misc->dma);
+}
+
+static void config_counts(struct hantro_ctx *ctx)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec;
+	struct hantro_aux_buf *misc = &vp9_dec->misc;
+	dma_addr_t addr = misc->dma + vp9_dec->ctx_counters_offset;
+
+	hantro_write_addr(ctx->dev, G2_VP9_CTX_COUNT_ADDR, addr);
+}
+
+static void config_seg_map(struct hantro_ctx *ctx,
+			   const struct v4l2_ctrl_vp9_frame *dec_params,
+			   bool intra_only, bool update_map)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	struct hantro_aux_buf *segment_map = &vp9_ctx->segment_map;
+	dma_addr_t addr;
+
+	if (intra_only ||
+	    (dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT)) {
+		memset(segment_map->cpu, 0, segment_map->size);
+		memset(vp9_ctx->feature_data, 0, sizeof(vp9_ctx->feature_data));
+		memset(vp9_ctx->feature_enabled, 0, sizeof(vp9_ctx->feature_enabled));
+	}
+
+	addr = segment_map->dma + vp9_ctx->active_segment * vp9_ctx->segment_map_size;
+	hantro_write_addr(ctx->dev, G2_VP9_SEGMENT_READ_ADDR, addr);
+
+	addr = segment_map->dma + (1 - vp9_ctx->active_segment) * vp9_ctx->segment_map_size;
+	hantro_write_addr(ctx->dev, G2_VP9_SEGMENT_WRITE_ADDR, addr);
+
+	if (update_map)
+		vp9_ctx->active_segment = 1 - vp9_ctx->active_segment;
+}
+
+static void
+config_source(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params,
+	      struct vb2_v4l2_buffer *vb2_src)
+{
+	dma_addr_t stream_base, tmp_addr;
+	unsigned int headers_size;
+	u32 src_len, start_bit, src_buf_len;
+
+	headers_size = dec_params->uncompressed_header_size
+		     + dec_params->compressed_header_size;
+
+	stream_base = vb2_dma_contig_plane_dma_addr(&vb2_src->vb2_buf, 0);
+	hantro_write_addr(ctx->dev, G2_STREAM_ADDR, stream_base);
+
+	tmp_addr = stream_base + headers_size;
+	start_bit = (tmp_addr & 0xf) * 8;
+	hantro_reg_write(ctx->dev, &g2_start_bit, start_bit);
+
+	src_len = vb2_get_plane_payload(&vb2_src->vb2_buf, 0);
+	src_len += start_bit / 8 - headers_size;
+	hantro_reg_write(ctx->dev, &g2_stream_len, src_len);
+
+	tmp_addr &= ~0xf;
+	hantro_reg_write(ctx->dev, &g2_strm_start_offset, tmp_addr - stream_base);
+	src_buf_len = vb2_plane_size(&vb2_src->vb2_buf, 0);
+	hantro_reg_write(ctx->dev, &g2_strm_buffer_len, src_buf_len);
+}
+
+static void
+config_registers(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params,
+		 struct vb2_v4l2_buffer *vb2_src, struct vb2_v4l2_buffer *vb2_dst)
+{
+	struct hantro_decoded_buffer *dst, *last, *mv_ref;
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	const struct v4l2_vp9_segmentation *seg;
+	bool intra_only, resolution_change;
+
+	/* derive per-frame VP9 state */
+	dst = vb2_to_hantro_decoded_buf(&vb2_dst->vb2_buf);
+
+	if (vp9_ctx->last.valid)
+		last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp);
+	else
+		last = dst;
+
+	update_dec_buf_info(dst, dec_params);
+	update_ctx_cur_info(vp9_ctx, dst, dec_params);
+	seg = &dec_params->seg;
+
+	intra_only = !!(dec_params->flags &
+			(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
+			V4L2_VP9_FRAME_FLAG_INTRA_ONLY));
+
+	if (!intra_only &&
+	    !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
+	    vp9_ctx->last.valid)
+		mv_ref = last;
+	else
+		mv_ref = dst;
+
+	resolution_change = dst->vp9.width != last->vp9.width ||
+			    dst->vp9.height != last->vp9.height;
+
+	/* configure basic registers */
+	hantro_reg_write(ctx->dev, &g2_mode, VP9_DEC_MODE);
+	hantro_reg_write(ctx->dev, &g2_strm_swap, 0xf);
+	hantro_reg_write(ctx->dev, &g2_dirmv_swap, 0xf);
+	hantro_reg_write(ctx->dev, &g2_compress_swap, 0xf);
+	hantro_reg_write(ctx->dev, &g2_buswidth, BUS_WIDTH_128);
+	hantro_reg_write(ctx->dev, &g2_max_burst, 16);
+	hantro_reg_write(ctx->dev, &g2_apf_threshold, 8);
+	hantro_reg_write(ctx->dev, &g2_ref_compress_bypass, 1);
+	hantro_reg_write(ctx->dev, &g2_clk_gate_e, 1);
+	hantro_reg_write(ctx->dev, &g2_max_cb_size, 6);
+	hantro_reg_write(ctx->dev, &g2_min_cb_size, 3);
+
+	config_output(ctx, dst, dec_params);
+
+	if (!intra_only)
+		config_ref_registers(ctx, dec_params, dst, mv_ref);
+
+	config_tiles(ctx, dec_params, dst);
+	config_segment(ctx, dec_params);
+	config_loop_filter(ctx, dec_params);
+	config_picture_dimensions(ctx, dst);
+	config_bit_depth(ctx, dec_params);
+	config_quant(ctx, dec_params);
+	config_others(ctx, dec_params, intra_only, resolution_change);
+	config_compound_reference(ctx, dec_params);
+	config_probs(ctx, dec_params);
+	config_counts(ctx);
+	config_seg_map(ctx, dec_params, intra_only,
+		       seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP);
+	config_source(ctx, dec_params, vb2_src);
+}
+
+int hantro_g2_vp9_dec_run(struct hantro_ctx *ctx)
+{
+	const struct v4l2_ctrl_vp9_frame *decode_params;
+	struct vb2_v4l2_buffer *src;
+	struct vb2_v4l2_buffer *dst;
+	int ret;
+
+	hantro_g2_check_idle(ctx->dev);
+
+	ret = start_prepare_run(ctx, &decode_params);
+	if (ret) {
+		hantro_end_prepare_run(ctx);
+		return ret;
+	}
+
+	src = hantro_get_src_buf(ctx);
+	dst = hantro_get_dst_buf(ctx);
+
+	config_registers(ctx, decode_params, src, dst);
+
+	hantro_end_prepare_run(ctx);
+
+	vdpu_write(ctx->dev, G2_REG_INTERRUPT_DEC_E, G2_REG_INTERRUPT);
+
+	return 0;
+}
+
+#define copy_tx_and_skip(p1, p2)				\
+do {								\
+	memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8));	\
+	memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16));	\
+	memcpy((p1)->tx32, (p2)->tx32, sizeof((p1)->tx32));	\
+	memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip));	\
+} while (0)
+
+void hantro_g2_vp9_dec_done(struct hantro_ctx *ctx)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	unsigned int fctx_idx;
+
+	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX))
+		goto out_update_last;
+
+	fctx_idx = vp9_ctx->cur.frame_context_idx;
+
+	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) {
+		/* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */
+		struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables;
+		bool frame_is_intra = vp9_ctx->cur.flags &
+		    (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY);
+		struct tx_and_skip {
+			u8 tx8[2][1];
+			u8 tx16[2][2];
+			u8 tx32[2][3];
+			u8 skip[3];
+		} _tx_skip, *tx_skip = &_tx_skip;
+		struct v4l2_vp9_frame_symbol_counts *counts;
+		struct symbol_counts *hantro_cnts;
+		u32 tx16p[2][4];
+		int i;
+
+		/* buffer the forward-updated TX and skip probs */
+		if (frame_is_intra)
+			copy_tx_and_skip(tx_skip, probs);
+
+		/* 6.1.2 refresh_probs(): load_probs() and load_probs2() */
+		*probs = vp9_ctx->frame_context[fctx_idx];
+
+		/* if FrameIsIntra then undo the effect of load_probs2() */
+		if (frame_is_intra)
+			copy_tx_and_skip(probs, tx_skip);
+
+		counts = &vp9_ctx->cnts;
+		hantro_cnts = vp9_ctx->misc.cpu + vp9_ctx->ctx_counters_offset;
+		for (i = 0; i < ARRAY_SIZE(tx16p); ++i) {
+			memcpy(tx16p[i],
+			       hantro_cnts->tx16x16_count[i],
+			       sizeof(hantro_cnts->tx16x16_count[0]));
+			tx16p[i][3] = 0;
+		}
+		counts->tx16p = &tx16p;
+
+		v4l2_vp9_adapt_coef_probs(probs, counts,
+					  !vp9_ctx->last.valid ||
+					  vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME,
+					  frame_is_intra);
+
+		if (!frame_is_intra) {
+			/* load_probs2() already done */
+			u32 mv_mode[7][4];
+
+			for (i = 0; i < ARRAY_SIZE(mv_mode); ++i) {
+				mv_mode[i][0] = hantro_cnts->inter_mode_counts[i][1][0];
+				mv_mode[i][1] = hantro_cnts->inter_mode_counts[i][2][0];
+				mv_mode[i][2] = hantro_cnts->inter_mode_counts[i][0][0];
+				mv_mode[i][3] = hantro_cnts->inter_mode_counts[i][2][1];
+			}
+			counts->mv_mode = &mv_mode;
+			v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts,
+						     vp9_ctx->cur.reference_mode,
+						     vp9_ctx->cur.interpolation_filter,
+						     vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags);
+		}
+	}
+
+	vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables;
+
+out_update_last:
+	vp9_ctx->last = vp9_ctx->cur;
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 42b3f3961f75..2961d399fd60 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -12,6 +12,7 @@
 #include <linux/interrupt.h>
 #include <linux/v4l2-controls.h>
 #include <media/v4l2-ctrls.h>
+#include <media/v4l2-vp9.h>
 #include <media/videobuf2-core.h>
 
 #define DEC_8190_ALIGN_MASK	0x07U
@@ -161,6 +162,50 @@ struct hantro_vp8_dec_hw_ctx {
 	struct hantro_aux_buf prob_tbl;
 };
 
+struct hantro_vp9_frame_info {
+	u32 valid : 1;
+	u32 frame_context_idx : 2;
+	u32 reference_mode : 2;
+	u32 tx_mode : 3;
+	u32 interpolation_filter : 3;
+	u32 flags;
+	u64 timestamp;
+};
+
+#define MAX_SB_COLS	64
+#define MAX_SB_ROWS	34
+
+/**
+ * struct hantro_vp9_dec_hw_ctx - VP9 decoder hardware context
+ *
+ */
+struct hantro_vp9_dec_hw_ctx {
+	struct hantro_aux_buf tile_edge;
+	struct hantro_aux_buf segment_map;
+	struct hantro_aux_buf misc;
+	struct v4l2_vp9_frame_symbol_counts cnts;
+	struct v4l2_vp9_frame_context probability_tables;
+	struct v4l2_vp9_frame_context frame_context[4];
+	struct hantro_vp9_frame_info cur;
+	struct hantro_vp9_frame_info last;
+
+	unsigned int bsd_ctrl_offset;
+	unsigned int segment_map_size;
+	unsigned int ctx_counters_offset;
+	unsigned int tile_info_offset;
+
+	unsigned short tile_r_info[MAX_SB_ROWS];
+	unsigned short tile_c_info[MAX_SB_COLS];
+	unsigned int last_tile_r;
+	unsigned int last_tile_c;
+	unsigned int last_sbs_r;
+	unsigned int last_sbs_c;
+
+	unsigned int active_segment;
+	u8 feature_enabled[8];
+	s16 feature_data[8][4];
+};
+
 /**
  * struct hantro_postproc_ctx
  *
@@ -267,6 +312,24 @@ void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
 size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
 size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
 
+static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
+{
+	return (dimension + 63) / 64;
+}
+
+static inline size_t
+hantro_vp9_mv_size(unsigned int width, unsigned int height)
+{
+	int num_ctbs;
+
+	/*
+	 * There can be up to (CTBs x 64) number of blocks,
+	 * and the motion vector for each block needs 16 bytes.
+	 */
+	num_ctbs = hantro_vp9_num_sbs(width) * hantro_vp9_num_sbs(height);
+	return (num_ctbs * 64) * 16;
+}
+
 static inline size_t
 hantro_h264_mv_size(unsigned int width, unsigned int height)
 {
@@ -308,6 +371,10 @@ void hantro_vp8_dec_exit(struct hantro_ctx *ctx);
 void hantro_vp8_prob_update(struct hantro_ctx *ctx,
 			    const struct v4l2_ctrl_vp8_frame *hdr);
 
+int hantro_g2_vp9_dec_run(struct hantro_ctx *ctx);
+void hantro_g2_vp9_dec_done(struct hantro_ctx *ctx);
+int hantro_vp9_dec_init(struct hantro_ctx *ctx);
+void hantro_vp9_dec_exit(struct hantro_ctx *ctx);
 void hantro_g2_check_idle(struct hantro_dev *vpu);
 
 #endif /* HANTRO_HW_H_ */
diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index d1f060c55fed..e4b0645ba6fc 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -299,6 +299,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
 			pix_mp->plane_fmt[0].sizeimage +=
 				hantro_h264_mv_size(pix_mp->width,
 						    pix_mp->height);
+		else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME &&
+			 !hantro_needs_postproc(ctx, fmt))
+			pix_mp->plane_fmt[0].sizeimage +=
+				hantro_vp9_mv_size(pix_mp->width,
+						   pix_mp->height);
 	} else if (!pix_mp->plane_fmt[0].sizeimage) {
 		/*
 		 * For coded formats the application can specify
@@ -407,6 +412,7 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
 	case V4L2_PIX_FMT_VP8_FRAME:
 	case V4L2_PIX_FMT_H264_SLICE:
 	case V4L2_PIX_FMT_HEVC_SLICE:
+	case V4L2_PIX_FMT_VP9_FRAME:
 		ctx->fh.m2m_ctx->out_q_ctx.q.requires_requests = true;
 		break;
 	default:
diff --git a/drivers/staging/media/hantro/hantro_vp9.c b/drivers/staging/media/hantro/hantro_vp9.c
new file mode 100644
index 000000000000..566cd376c097
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_vp9.c
@@ -0,0 +1,240 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VP9 codec driver
+ *
+ * Copyright (C) 2021 Collabora Ltd.
+ */
+
+#include <linux/types.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+#include "hantro_vp9.h"
+
+#define POW2(x) (1 << (x))
+
+#define MAX_LOG2_TILE_COLUMNS 6
+#define MAX_NUM_TILE_COLS POW2(MAX_LOG2_TILE_COLUMNS)
+#define MAX_TILE_COLS 20
+#define MAX_TILE_ROWS 22
+
+static size_t hantro_vp9_tile_filter_size(unsigned int height)
+{
+	u32 h, height32, size;
+
+	h = roundup(height, 8);
+
+	height32 = roundup(h, 64);
+	size = 24 * height32 * (MAX_NUM_TILE_COLS - 1); /* luma: 8, chroma: 8 + 8 */
+
+	return size;
+}
+
+static size_t hantro_vp9_bsd_control_size(unsigned int height)
+{
+	u32 h, height32;
+
+	h = roundup(height, 8);
+	height32 = roundup(h, 64);
+
+	return 16 * (height32 / 4) * (MAX_NUM_TILE_COLS - 1);
+}
+
+static size_t hantro_vp9_segment_map_size(unsigned int width, unsigned int height)
+{
+	u32 w, h;
+	int num_ctbs;
+
+	w = roundup(width, 8);
+	h = roundup(height, 8);
+	num_ctbs = ((w + 63) / 64) * ((h + 63) / 64);
+
+	return num_ctbs * 32;
+}
+
+static inline size_t hantro_vp9_prob_tab_size(void)
+{
+	return roundup(sizeof(struct hantro_g2_all_probs), 16);
+}
+
+static inline size_t hantro_vp9_count_tab_size(void)
+{
+	return roundup(sizeof(struct symbol_counts), 16);
+}
+
+static inline size_t hantro_vp9_tile_info_size(void)
+{
+	return roundup((MAX_TILE_COLS * MAX_TILE_ROWS * 4 * sizeof(u16) + 15 + 16) & ~0xf, 16);
+}
+
+static void *get_coeffs_arr(struct symbol_counts *cnts, int i, int j, int k, int l, int m)
+{
+	if (i == 0)
+		return &cnts->count_coeffs[j][k][l][m];
+
+	if (i == 1)
+		return &cnts->count_coeffs8x8[j][k][l][m];
+
+	if (i == 2)
+		return &cnts->count_coeffs16x16[j][k][l][m];
+
+	if (i == 3)
+		return &cnts->count_coeffs32x32[j][k][l][m];
+
+	return NULL;
+}
+
+static void *get_eobs1(struct symbol_counts *cnts, int i, int j, int k, int l, int m)
+{
+	if (i == 0)
+		return &cnts->count_coeffs[j][k][l][m][3];
+
+	if (i == 1)
+		return &cnts->count_coeffs8x8[j][k][l][m][3];
+
+	if (i == 2)
+		return &cnts->count_coeffs16x16[j][k][l][m][3];
+
+	if (i == 3)
+		return &cnts->count_coeffs32x32[j][k][l][m][3];
+
+	return NULL;
+}
+
+#define INNER_LOOP \
+	do {										\
+		for (m = 0; m < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0][0][0]); ++m) {	\
+			vp9_ctx->cnts.coeff[i][j][k][l][m] =				\
+				get_coeffs_arr(cnts, i, j, k, l, m);			\
+			vp9_ctx->cnts.eob[i][j][k][l][m][0] =				\
+				&cnts->count_eobs[i][j][k][l][m];			\
+			vp9_ctx->cnts.eob[i][j][k][l][m][1] =				\
+				get_eobs1(cnts, i, j, k, l, m);				\
+		}									\
+	} while (0)
+
+static void init_v4l2_vp9_count_tbl(struct hantro_ctx *ctx)
+{
+	struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec;
+	struct symbol_counts *cnts = vp9_ctx->misc.cpu + vp9_ctx->ctx_counters_offset;
+	int i, j, k, l, m;
+
+	vp9_ctx->cnts.partition = &cnts->partition_counts;
+	vp9_ctx->cnts.skip = &cnts->mbskip_count;
+	vp9_ctx->cnts.intra_inter = &cnts->intra_inter_count;
+	vp9_ctx->cnts.tx32p = &cnts->tx32x32_count;
+	/*
+	 * G2 hardware uses tx16x16_count[2][3], while the API
+	 * expects tx16p[2][4], so this must be explicitly copied
+	 * into vp9_ctx->cnts.tx16p when passing the data to the
+	 * VP9 library function.
+	 */
+	vp9_ctx->cnts.tx8p = &cnts->tx8x8_count;
+
+	vp9_ctx->cnts.y_mode = &cnts->sb_ymode_counts;
+	vp9_ctx->cnts.uv_mode = &cnts->uv_mode_counts;
+	vp9_ctx->cnts.comp = &cnts->comp_inter_count;
+	vp9_ctx->cnts.comp_ref = &cnts->comp_ref_count;
+	vp9_ctx->cnts.single_ref = &cnts->single_ref_count;
+	vp9_ctx->cnts.filter = &cnts->switchable_interp_counts;
+	vp9_ctx->cnts.mv_joint = &cnts->mv_counts.joints;
+	vp9_ctx->cnts.sign = &cnts->mv_counts.sign;
+	vp9_ctx->cnts.classes = &cnts->mv_counts.classes;
+	vp9_ctx->cnts.class0 = &cnts->mv_counts.class0;
+	vp9_ctx->cnts.bits = &cnts->mv_counts.bits;
+	vp9_ctx->cnts.class0_fp = &cnts->mv_counts.class0_fp;
+	vp9_ctx->cnts.fp = &cnts->mv_counts.fp;
+	vp9_ctx->cnts.class0_hp = &cnts->mv_counts.class0_hp;
+	vp9_ctx->cnts.hp = &cnts->mv_counts.hp;
+
+	for (i = 0; i < ARRAY_SIZE(vp9_ctx->cnts.coeff); ++i)
+		for (j = 0; j < ARRAY_SIZE(vp9_ctx->cnts.coeff[i]); ++j)
+			for (k = 0; k < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0]); ++k)
+				for (l = 0; l < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0][0]); ++l)
+					INNER_LOOP;
+}
+
+int hantro_vp9_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_variant *variant = vpu->variant;
+	struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec;
+	struct hantro_aux_buf *tile_edge = &vp9_dec->tile_edge;
+	struct hantro_aux_buf *segment_map = &vp9_dec->segment_map;
+	struct hantro_aux_buf *misc = &vp9_dec->misc;
+	u32 i, max_width, max_height, size;
+
+	if (variant->num_dec_fmts < 1)
+		return -EINVAL;
+
+	for (i = 0; i < variant->num_dec_fmts; ++i)
+		if (variant->dec_fmts[i].fourcc == V4L2_PIX_FMT_VP9_FRAME)
+			break;
+
+	if (i == variant->num_dec_fmts)
+		return -EINVAL;
+
+	max_width = vpu->variant->dec_fmts[i].frmsize.max_width;
+	max_height = vpu->variant->dec_fmts[i].frmsize.max_height;
+
+	size = hantro_vp9_tile_filter_size(max_height);
+	vp9_dec->bsd_ctrl_offset = size;
+	size += hantro_vp9_bsd_control_size(max_height);
+
+	tile_edge->cpu = dma_alloc_coherent(vpu->dev, size, &tile_edge->dma, GFP_KERNEL);
+	if (!tile_edge->cpu)
+		return -ENOMEM;
+
+	tile_edge->size = size;
+	memset(tile_edge->cpu, 0, size);
+
+	size = hantro_vp9_segment_map_size(max_width, max_height);
+	vp9_dec->segment_map_size = size;
+	size *= 2; /* we need two areas of this size, used alternately */
+
+	segment_map->cpu = dma_alloc_coherent(vpu->dev, size, &segment_map->dma, GFP_KERNEL);
+	if (!segment_map->cpu)
+		goto err_segment_map;
+
+	segment_map->size = size;
+	memset(segment_map->cpu, 0, size);
+
+	size = hantro_vp9_prob_tab_size();
+	vp9_dec->ctx_counters_offset = size;
+	size += hantro_vp9_count_tab_size();
+	vp9_dec->tile_info_offset = size;
+	size += hantro_vp9_tile_info_size();
+
+	misc->cpu = dma_alloc_coherent(vpu->dev, size, &misc->dma, GFP_KERNEL);
+	if (!misc->cpu)
+		goto err_misc;
+
+	misc->size = size;
+	memset(misc->cpu, 0, size);
+
+	init_v4l2_vp9_count_tbl(ctx);
+
+	return 0;
+
+err_misc:
+	dma_free_coherent(vpu->dev, segment_map->size, segment_map->cpu, segment_map->dma);
+
+err_segment_map:
+	dma_free_coherent(vpu->dev, tile_edge->size, tile_edge->cpu, tile_edge->dma);
+
+	return -ENOMEM;
+}
+
+void hantro_vp9_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec;
+	struct hantro_aux_buf *tile_edge = &vp9_dec->tile_edge;
+	struct hantro_aux_buf *segment_map = &vp9_dec->segment_map;
+	struct hantro_aux_buf *misc = &vp9_dec->misc;
+
+	dma_free_coherent(vpu->dev, misc->size, misc->cpu, misc->dma);
+	dma_free_coherent(vpu->dev, segment_map->size, segment_map->cpu, segment_map->dma);
+	dma_free_coherent(vpu->dev, tile_edge->size, tile_edge->cpu, tile_edge->dma);
+}
diff --git a/drivers/staging/media/hantro/hantro_vp9.h b/drivers/staging/media/hantro/hantro_vp9.h
new file mode 100644
index 000000000000..c7f4bd3ff8dd
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_vp9.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Hantro VP9 codec driver
+ *
+ * Copyright (C) 2021 Collabora Ltd.
+ */
+
+struct hantro_g2_mv_probs {
+	u8 joint[3];
+	u8 sign[2];
+	u8 class0_bit[2][1];
+	u8 fr[2][3];
+	u8 class0_hp[2];
+	u8 hp[2];
+	u8 classes[2][10];
+	u8 class0_fr[2][2][3];
+	u8 bits[2][10];
+};
+
+struct hantro_g2_probs {
+	u8 inter_mode[7][4];
+	u8 is_inter[4];
+	u8 uv_mode[10][8];
+	u8 tx8[2][1];
+	u8 tx16[2][2];
+	u8 tx32[2][3];
+	u8 y_mode_tail[4][1];
+	u8 y_mode[4][8];
+	u8 partition[2][16][4]; /* [keyframe][][], [inter][][] */
+	u8 uv_mode_tail[10][1];
+	u8 interp_filter[4][2];
+	u8 comp_mode[5];
+	u8 skip[3];
+
+	u8 pad1[1];
+
+	struct hantro_g2_mv_probs mv;
+
+	u8 single_ref[5][2];
+	u8 comp_ref[5];
+
+	u8 pad2[17];
+
+	u8 coef[4][2][2][6][6][4];
+};
+
+struct hantro_g2_all_probs {
+	u8 kf_y_mode_prob[10][10][8];
+
+	u8 kf_y_mode_prob_tail[10][10][1];
+	u8 ref_pred_probs[3];
+	u8 mb_segment_tree_probs[7];
+	u8 segment_pred_probs[3];
+	u8 ref_scores[4];
+	u8 prob_comppred[2];
+
+	u8 pad1[9];
+
+	u8 kf_uv_mode_prob[10][8];
+	u8 kf_uv_mode_prob_tail[10][1];
+
+	u8 pad2[6];
+
+	struct hantro_g2_probs probs;
+};
+
+struct mv_counts {
+	u32 joints[4];
+	u32 sign[2][2];
+	u32 classes[2][11];
+	u32 class0[2][2];
+	u32 bits[2][10][2];
+	u32 class0_fp[2][2][4];
+	u32 fp[2][4];
+	u32 class0_hp[2][2];
+	u32 hp[2][2];
+};
+
+struct symbol_counts {
+	u32 inter_mode_counts[7][3][2];
+	u32 sb_ymode_counts[4][10];
+	u32 uv_mode_counts[10][10];
+	u32 partition_counts[16][4];
+	u32 switchable_interp_counts[4][3];
+	u32 intra_inter_count[4][2];
+	u32 comp_inter_count[5][2];
+	u32 single_ref_count[5][2][2];
+	u32 comp_ref_count[5][2];
+	u32 tx32x32_count[2][4];
+	u32 tx16x16_count[2][3];
+	u32 tx8x8_count[2][2];
+	u32 mbskip_count[3][2];
+
+	struct mv_counts mv_counts;
+
+	u32 count_coeffs[2][2][6][6][4];
+	u32 count_coeffs8x8[2][2][6][6][4];
+	u32 count_coeffs16x16[2][2][6][6][4];
+	u32 count_coeffs32x32[2][2][6][6][4];
+
+	u32 count_eobs[4][2][2][6][6];
+};
+
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index a40b161e5956..455a107ffb02 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -150,6 +150,19 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
 			.step_height = MB_DIM,
 		},
 	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9_FRAME,
+		.codec_mode = HANTRO_MODE_VP9_DEC,
+		.max_depth = 2,
+		.frmsize = {
+			.min_width = 48,
+			.max_width = 3840,
+			.step_width = MB_DIM,
+			.min_height = 48,
+			.max_height = 2160,
+			.step_height = MB_DIM,
+		},
+	},
 };
 
 static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
@@ -241,6 +254,13 @@ static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = {
 		.init = hantro_hevc_dec_init,
 		.exit = hantro_hevc_dec_exit,
 	},
+	[HANTRO_MODE_VP9_DEC] = {
+		.run = hantro_g2_vp9_dec_run,
+		.done = hantro_g2_vp9_dec_done,
+		.reset = imx8m_vpu_g2_reset,
+		.init = hantro_vp9_dec_init,
+		.exit = hantro_vp9_dec_exit,
+	},
 };
 
 /*
@@ -281,7 +301,7 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
 	.dec_offset = 0x0,
 	.dec_fmts = imx8m_vpu_g2_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
-	.codec = HANTRO_HEVC_DECODER,
+	.codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
 	.codec_ops = imx8mq_vpu_g2_codec_ops,
 	.init = imx8mq_vpu_hw_init,
 	.runtime_resume = imx8mq_runtime_resume,
-- 
2.17.1



* [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (9 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 10/11] media: hantro: Support VP9 on the G2 core Andrzej Pietrasiewicz
@ 2021-09-29 16:04 ` Andrzej Pietrasiewicz
  2021-10-14 17:42   ` Jernej Škrabec
  2021-10-19 17:55 ` [PATCH v7 00/11] VP9 codec V4L2 control interface Ezequiel Garcia
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-09-29 16:04 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
Enable the G2 post-processor block, in order to produce regular NV12.

The logic in hantro_postproc.c is leveraged to take care of allocating
the extra buffers and configure the post-processor, which is
significantly simpler than the one on the G1.
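
As a rough illustration of the "regular NV12" layout the post-processor
writes (not part of the patch): a full-resolution luma plane is followed
by a half-height plane of interleaved Cb/Cr samples. The sketch below
assumes no line padding, so the chroma plane starts at width * height,
the same offset hantro_postproc_g2_enable() uses. The struct and function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the hantro driver. */
struct nv12_layout {
	size_t luma_size;     /* bytes in the Y plane */
	size_t chroma_offset; /* byte offset of the interleaved CbCr plane */
	size_t total_size;    /* bytes in the whole frame */
};

/*
 * Linear (raster) NV12 with no line padding: a full-resolution Y plane
 * followed by a half-height plane of interleaved Cb/Cr samples.
 */
static struct nv12_layout nv12_linear_layout(uint32_t width, uint32_t height)
{
	struct nv12_layout l;

	/* 4:2:0 subsampling needs even dimensions. */
	assert(width % 2 == 0 && height % 2 == 0);

	l.luma_size = (size_t)width * height;
	l.chroma_offset = l.luma_size;               /* chroma follows luma */
	l.total_size = l.luma_size + l.luma_size / 2; /* 12 bits per pixel */
	return l;
}
```

If the destination format used a stride larger than the width, the
offset would have to be computed from bytes per line instead.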

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
---
 .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
 drivers/staging/media/hantro/hantro_hw.h      |  1 +
 .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
 drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
 4 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
index 7f827b9f0133..1a26be72c878 100644
--- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
+++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
@@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
 	hantro_reg_write(ctx->dev, &g2_out_dis, 0);
 	hantro_reg_write(ctx->dev, &g2_output_format, 0);
 
-	luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
+	luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
 	hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
 
 	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
@@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
 	hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
 	hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
 
-	luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
+	luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
 	hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
 
 	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
@@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
 	config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
 	config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
 
-	mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
+	mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
 		  mv_offset(ctx, dec_params);
 	hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
 
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 2961d399fd60..3d4a5dc1e6d5 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
 extern const struct hantro_variant sama5d4_vdec_variant;
 
 extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
+extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
 
 extern const u32 hantro_vp8_dec_mc_filter[8][6];
 
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 4549aec08feb..79a66d001738 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -11,6 +11,7 @@
 #include "hantro.h"
 #include "hantro_hw.h"
 #include "hantro_g1_regs.h"
+#include "hantro_g2_regs.h"
 
 #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
 { \
@@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
 	HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
 }
 
+static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct vb2_v4l2_buffer *dst_buf;
+	size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
+	dma_addr_t dst_dma;
+
+	dst_buf = hantro_get_dst_buf(ctx);
+	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+
+	hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
+	hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
+	hantro_reg_write(vpu, &g2_out_rs_e, 1);
+}
+
 void hantro_postproc_free(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
@@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
 	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
 		buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
 						ctx->dst_fmt.height);
+	else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
+		buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
+					       ctx->dst_fmt.height);
 
 	for (i = 0; i < num_buffers; ++i) {
 		struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
@@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
 	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
 }
 
+static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	hantro_reg_write(vpu, &g2_out_rs_e, 0);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
@@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
 	.enable = hantro_postproc_g1_enable,
 	.disable = hantro_postproc_g1_disable,
 };
+
+const struct hantro_postproc_ops hantro_g2_postproc_ops = {
+	.enable = hantro_postproc_g2_enable,
+	.disable = hantro_postproc_g2_disable,
+};
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index 455a107ffb02..1a43f6fceef9 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
 	},
 };
 
+static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_NV12,
+		.codec_mode = HANTRO_MODE_NONE,
+		.postprocessed = true,
+	},
+};
+
 static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_NV12_4L4,
@@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
 	.dec_offset = 0x0,
 	.dec_fmts = imx8m_vpu_g2_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
+	.postproc_fmts = imx8m_vpu_g2_postproc_fmts,
+	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
+	.postproc_ops = &hantro_g2_postproc_ops,
 	.codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
 	.codec_ops = imx8mq_vpu_g2_codec_ops,
 	.init = imx8mq_vpu_hw_init,
-- 
2.17.1



* Re: [PATCH v7 07/11] media: rkvdec: Add the VP9 backend
  2021-09-29 16:04 ` [PATCH v7 07/11] media: rkvdec: Add the VP9 backend Andrzej Pietrasiewicz
@ 2021-10-08 10:30   ` Chen-Yu Tsai
  2021-10-19 23:24   ` Alex Bee
  1 sibling, 0 replies; 37+ messages in thread
From: Chen-Yu Tsai @ 2021-10-08 10:30 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz
  Cc: Linux Media Mailing List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, LKML,
	open list:ARM/Rockchip SoC...,
	linux-staging, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Jernej Skrabec, Mauro Carvalho Chehab,
	Nicolas Dufresne, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia,
	Adrian Ratiu

Hi,

On Thu, Sep 30, 2021 at 12:07 AM Andrzej Pietrasiewicz
<andrzej.p@collabora.com> wrote:
>
> From: Boris Brezillon <boris.brezillon@collabora.com>
>
> The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add
> a backend for this new format.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> Co-developed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> ---
>  drivers/staging/media/rkvdec/Kconfig      |    1 +
>  drivers/staging/media/rkvdec/Makefile     |    2 +-
>  drivers/staging/media/rkvdec/rkvdec-vp9.c | 1078 +++++++++++++++++++++
>  drivers/staging/media/rkvdec/rkvdec.c     |   52 +-
>  drivers/staging/media/rkvdec/rkvdec.h     |   12 +-
>  5 files changed, 1137 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>

[...]

> diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
> index 7131156c1f2c..6aa8aca66547 100644
> --- a/drivers/staging/media/rkvdec/rkvdec.c
> +++ b/drivers/staging/media/rkvdec/rkvdec.c

[...]

> @@ -319,7 +354,7 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
>         struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
>         const struct rkvdec_coded_fmt_desc *desc;
>         struct v4l2_format *cap_fmt;
> -       struct vb2_queue *peer_vq;
> +       struct vb2_queue *peer_vq, *vq;
>         int ret;
>
>         /*
> @@ -331,6 +366,15 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
>         if (vb2_is_busy(peer_vq))
>                 return -EBUSY;
>
> +       /*
> +        * Some codecs like VP9 can contain dynamic resolution changes which
> +        * are currently not supported by the V4L2 API or driver, so return
> +        * an error if userspace tries to reconfigure the output format.
> +        */
> +       vq = v4l2_m2m_get_vq(m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> +       if (vb2_is_busy(vq))
> +               return -EINVAL;

This check is already done in rkvdec_s_fmt(), though it returns -EBUSY
instead. And I don't see similar changes to Hantro, so maybe this isn't
an API limitation as described in the comment? My recent patch [1] also
loosens the restrictions on this.
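
To make the point about the two checks concrete, here is a small
stand-alone model (not driver code). It assumes, as described above, that
rkvdec_s_fmt() already rejects a busy queue with -EBUSY. All names are
hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for the driver state; names are hypothetical. */
struct fake_queues {
	bool output_busy;  /* models vb2_is_busy() on the OUTPUT queue */
	bool capture_busy; /* models vb2_is_busy() on the peer queue */
};

/* Models rkvdec_s_fmt(): refuses to change the format on a busy queue. */
static int fake_s_fmt(const struct fake_queues *q)
{
	return q->output_busy ? -EBUSY : 0;
}

/* Models rkvdec_s_output_fmt() with and without the check added here. */
static int fake_s_output_fmt(const struct fake_queues *q, bool with_new_check)
{
	assert(q);

	if (q->capture_busy)
		return -EBUSY;
	/* The added check fires first and changes the error code. */
	if (with_new_check && q->output_busy)
		return -EINVAL;
	return fake_s_fmt(q);
}
```

In this model a busy OUTPUT queue is rejected either way; the added check
only changes the error that userspace sees from -EBUSY to -EINVAL.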

ChenYu

[1] https://lore.kernel.org/linux-media/20211008100423.739462-3-wenst@chromium.org/

> +
>         ret = rkvdec_s_fmt(file, priv, f, rkvdec_try_output_fmt);
>         if (ret)
>                 return ret;


* Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-09-29 16:04 ` [PATCH v7 11/11] media: hantro: Support NV12 " Andrzej Pietrasiewicz
@ 2021-10-14 17:42   ` Jernej Škrabec
  2021-10-15 17:19     ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Jernej Škrabec @ 2021-10-14 17:42 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging, Andrzej Pietrasiewicz
  Cc: Andrzej Pietrasiewicz, Benjamin Gaignard, Boris Brezillon,
	Ezequiel Garcia, Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil,
	Heiko Stuebner, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia

Hi Andrzej!

On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
wrote:
> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> Enable the G2 post-processor block, in order to produce regular NV12.
> 
> The logic in hantro_postproc.c is leveraged to take care of allocating
> the extra buffers and configure the post-processor, which is
> significantly simpler than the one on the G1.

Quick summary of the discussion on LibreELEC Slack:
When using the NV12 format on the Allwinner H6 variant of G2 (needs some
driver changes), I get frames out of order. If I use the native NV12 tiled
format, frames are ordered correctly.

Currently I'm not sure whether this is an issue with my changes or a
general issue.

I would be grateful if anyone could test frame order with and without
postprocessing enabled on imx8. Take some dynamic video with a lot of short
scenes. It's pretty obvious when frames are out of order.

However, given that the frames themselves are decoded correctly and, without
postprocessing, come out in the right order, that shouldn't block merging
the previous patches. I tried a few different videos and frames were all
decoded correctly.

Best regards,
Jernej

> 
> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> ---
>  .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
>  drivers/staging/media/hantro/hantro_hw.h      |  1 +
>  .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
>  drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
>  4 files changed, 46 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> index 7f827b9f0133..1a26be72c878 100644
> --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
>  	hantro_reg_write(ctx->dev, &g2_out_dis, 0);
>  	hantro_reg_write(ctx->dev, &g2_output_format, 0);
>  
> -	luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
> +	luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
>  	hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
>  
>  	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
>  	hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
>  	hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
>  
> -	luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
> +	luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
>  	hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
>  
>  	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
>  	config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
>  	config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
>  
> -	mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
> +	mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
>  		  mv_offset(ctx, dec_params);
>  	hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
>  
> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> index 2961d399fd60..3d4a5dc1e6d5 100644
> --- a/drivers/staging/media/hantro/hantro_hw.h
> +++ b/drivers/staging/media/hantro/hantro_hw.h
> @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
>  extern const struct hantro_variant sama5d4_vdec_variant;
>  
>  extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
> +extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
>  
>  extern const u32 hantro_vp8_dec_mc_filter[8][6];
>  
> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> index 4549aec08feb..79a66d001738 100644
> --- a/drivers/staging/media/hantro/hantro_postproc.c
> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> @@ -11,6 +11,7 @@
>  #include "hantro.h"
>  #include "hantro_hw.h"
>  #include "hantro_g1_regs.h"
> +#include "hantro_g2_regs.h"
>  
>  #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
>  { \
> @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
>  	HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
>  }
>  
> +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
> +{
> +	struct hantro_dev *vpu = ctx->dev;
> +	struct vb2_v4l2_buffer *dst_buf;
> +	size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
> +	dma_addr_t dst_dma;
> +
> +	dst_buf = hantro_get_dst_buf(ctx);
> +	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> +
> +	hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
> +	hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
> +	hantro_reg_write(vpu, &g2_out_rs_e, 1);
> +}
> +
>  void hantro_postproc_free(struct hantro_ctx *ctx)
>  {
>  	struct hantro_dev *vpu = ctx->dev;
> @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
>  	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
>  		buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
>  						ctx->dst_fmt.height);
> +	else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
> +		buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
> +					       ctx->dst_fmt.height);
>  
>  	for (i = 0; i < num_buffers; ++i) {
>  		struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
> @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
>  	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
>  }
>  
> +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
> +{
> +	struct hantro_dev *vpu = ctx->dev;
> +
> +	hantro_reg_write(vpu, &g2_out_rs_e, 0);
> +}
> +
>  void hantro_postproc_disable(struct hantro_ctx *ctx)
>  {
>  	struct hantro_dev *vpu = ctx->dev;
> @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
>  	.enable = hantro_postproc_g1_enable,
>  	.disable = hantro_postproc_g1_disable,
>  };
> +
> +const struct hantro_postproc_ops hantro_g2_postproc_ops = {
> +	.enable = hantro_postproc_g2_enable,
> +	.disable = hantro_postproc_g2_disable,
> +};
> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> index 455a107ffb02..1a43f6fceef9 100644
> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
>  	},
>  };
>  
> +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
> +	{
> +		.fourcc = V4L2_PIX_FMT_NV12,
> +		.codec_mode = HANTRO_MODE_NONE,
> +		.postprocessed = true,
> +	},
> +};
> +
>  static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
>  	{
>  		.fourcc = V4L2_PIX_FMT_NV12_4L4,
> @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
>  	.dec_offset = 0x0,
>  	.dec_fmts = imx8m_vpu_g2_dec_fmts,
>  	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
> +	.postproc_fmts = imx8m_vpu_g2_postproc_fmts,
> +	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
> +	.postproc_ops = &hantro_g2_postproc_ops,
>  	.codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
>  	.codec_ops = imx8mq_vpu_g2_codec_ops,
>  	.init = imx8mq_vpu_hw_init,
> -- 
> 2.17.1
> 
> 




* Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-14 17:42   ` Jernej Škrabec
@ 2021-10-15 17:19     ` Andrzej Pietrasiewicz
  2021-10-19 16:38       ` Jernej Škrabec
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-10-15 17:19 UTC (permalink / raw)
  To: Jernej Škrabec, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	kernel, Ezequiel Garcia

Hi Jernej,

On 14.10.2021 at 19:42, Jernej Škrabec wrote:
> Hi Andrzej!
> 
> On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
> wrote:
>> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
>> Enable the G2 post-processor block, in order to produce regular NV12.
>>
>> The logic in hantro_postproc.c is leveraged to take care of allocating
>> the extra buffers and configure the post-processor, which is
>> significantly simpler than the one on the G1.
> 
> Quick summary of the discussion on LibreELEC Slack:
> When using the NV12 format on the Allwinner H6 variant of G2 (needs some
> driver changes), I get frames out of order. If I use the native NV12 tiled
> format, frames are ordered correctly.
> 
> Currently I'm not sure whether this is an issue with my changes or a
> general issue.
> 
> I would be grateful if anyone could test frame order with and without
> postprocessing enabled on imx8. Take some dynamic video with a lot of short
> scenes. It's pretty obvious when frames are out of order.
> 

I checked on imx8 and cannot observe any such artifacts.

Andrzej

> However, given that the frames themselves are decoded correctly and, without
> postprocessing, come out in the right order, that shouldn't block merging
> the previous patches. I tried a few different videos and frames were all
> decoded correctly.
> 
> Best regards,
> Jernej
> 
>>
>> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
>> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
>> ---
>>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
>>   drivers/staging/media/hantro/hantro_hw.h      |  1 +
>>   .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
>>   drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
>>   4 files changed, 46 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>> index 7f827b9f0133..1a26be72c878 100644
>> --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>> @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
>>   	hantro_reg_write(ctx->dev, &g2_out_dis, 0);
>>   	hantro_reg_write(ctx->dev, &g2_output_format, 0);
>>   
>> -	luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
>> +	luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
>>   	hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
>>   
>>   	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
>> @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
>>   	hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
>>   	hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
>>   
>> -	luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
>> +	luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
>>   	hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
>>   
>>   	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
>> @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
>>   	config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
>>   	config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
>>   
>> -	mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
>> +	mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
>>   		  mv_offset(ctx, dec_params);
>>   	hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
>>   
>> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>> index 2961d399fd60..3d4a5dc1e6d5 100644
>> --- a/drivers/staging/media/hantro/hantro_hw.h
>> +++ b/drivers/staging/media/hantro/hantro_hw.h
>> @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
>>   extern const struct hantro_variant sama5d4_vdec_variant;
>>   
>>   extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
>> +extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
>>   
>>   extern const u32 hantro_vp8_dec_mc_filter[8][6];
>>   
>> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
>> index 4549aec08feb..79a66d001738 100644
>> --- a/drivers/staging/media/hantro/hantro_postproc.c
>> +++ b/drivers/staging/media/hantro/hantro_postproc.c
>> @@ -11,6 +11,7 @@
>>   #include "hantro.h"
>>   #include "hantro_hw.h"
>>   #include "hantro_g1_regs.h"
>> +#include "hantro_g2_regs.h"
>>   
>>   #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
>>   { \
>> @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
>>   	HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
>>   }
>>   
>> +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
>> +{
>> +	struct hantro_dev *vpu = ctx->dev;
>> +	struct vb2_v4l2_buffer *dst_buf;
>> +	size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
>> +	dma_addr_t dst_dma;
>> +
>> +	dst_buf = hantro_get_dst_buf(ctx);
>> +	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
>> +
>> +	hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
>> +	hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
>> +	hantro_reg_write(vpu, &g2_out_rs_e, 1);
>> +}
>> +
>>   void hantro_postproc_free(struct hantro_ctx *ctx)
>>   {
>>   	struct hantro_dev *vpu = ctx->dev;
>> @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
>>   	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
>>   		buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
>>   						ctx->dst_fmt.height);
>> +	else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
>> +		buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
>> +					       ctx->dst_fmt.height);
>>   
>>   	for (i = 0; i < num_buffers; ++i) {
>>   		struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
>> @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
>>   	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
>>   }
>>   
>> +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
>> +{
>> +	struct hantro_dev *vpu = ctx->dev;
>> +
>> +	hantro_reg_write(vpu, &g2_out_rs_e, 0);
>> +}
>> +
>>   void hantro_postproc_disable(struct hantro_ctx *ctx)
>>   {
>>   	struct hantro_dev *vpu = ctx->dev;
>> @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
>>   	.enable = hantro_postproc_g1_enable,
>>   	.disable = hantro_postproc_g1_disable,
>>   };
>> +
>> +const struct hantro_postproc_ops hantro_g2_postproc_ops = {
>> +	.enable = hantro_postproc_g2_enable,
>> +	.disable = hantro_postproc_g2_disable,
>> +};
>> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> index 455a107ffb02..1a43f6fceef9 100644
>> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
>>   	},
>>   };
>>   
>> +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
>> +	{
>> +		.fourcc = V4L2_PIX_FMT_NV12,
>> +		.codec_mode = HANTRO_MODE_NONE,
>> +		.postprocessed = true,
>> +	},
>> +};
>> +
>>   static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
>>   	{
>>   		.fourcc = V4L2_PIX_FMT_NV12_4L4,
>> @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
>>   	.dec_offset = 0x0,
>>   	.dec_fmts = imx8m_vpu_g2_dec_fmts,
>>   	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
>> +	.postproc_fmts = imx8m_vpu_g2_postproc_fmts,
>> +	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
>> +	.postproc_ops = &hantro_g2_postproc_ops,
>>   	.codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
>>   	.codec_ops = imx8mq_vpu_g2_codec_ops,
>>   	.init = imx8mq_vpu_hw_init,
>> -- 
>> 2.17.1
>>
>>
> 
> 



* Re: Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-15 17:19     ` Andrzej Pietrasiewicz
@ 2021-10-19 16:38       ` Jernej Škrabec
  2021-10-20 11:06         ` Ezequiel Garcia
  0 siblings, 1 reply; 37+ messages in thread
From: Jernej Škrabec @ 2021-10-19 16:38 UTC (permalink / raw)
  To: linux-media, linux-arm-kernel, linux-kernel, linux-rockchip,
	linux-staging, Andrzej Pietrasiewicz
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	kernel, Ezequiel Garcia

Hi Andrzej!

Dne petek, 15. oktober 2021 ob 19:19:47 CEST je Andrzej Pietrasiewicz 
napisal(a):
> Hi Jernej,
> 
> W dniu 14.10.2021 o 19:42, Jernej Škrabec pisze:
> > Hi Andrzej!
> > 
> > On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz wrote:
> >> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> >> Enable the G2 post-processor block, in order to produce regular NV12.
> >>
> >> The logic in hantro_postproc.c is leveraged to take care of allocating
> >> the extra buffers and configure the post-processor, which is
> >> significantly simpler than the one on the G1.
> > 
> > Quick summary of discussion on LibreELEC Slack:
> > When using the NV12 format on the Allwinner H6 variant of G2 (needs some driver
> > changes), I get frames out of order. If I use the native NV12 tiled format,
> > frames are ordered correctly.
> > 
> > Currently I'm not sure whether this is an issue with my changes or a general
> > issue.
> > 
> > I would be grateful if anyone can test frame order with and without
> > postprocessing enabled on imx8. Take some dynamic video with a lot of
> > short scenes. It's pretty obvious when frames are out of order.
> > 
> 
> I checked on imx8 and cannot observe any such artifacts.

I finally found the issue. As you mentioned on Slack, register write order has
already affected decoding once. Well, that is the case again. I made a hacky test
that moved the postproc enable call after the output buffers are set, and it
worked. So this is actually a core quirk, which is obviously fixed in newer variants.
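
To make the ordering concrete, here is a minimal sketch in plain C, not driver
code. It assumes a mock register log, and every name in it (ex_reg, ex_write,
ex_config_dec_output, ex_enable_postproc, ex_prepare_run) is hypothetical; none
of them are Hantro registers or functions. It only shows the sequence that
worked in the hacky test: output buffers first, postproc enable last.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register IDs, not real Hantro G2 offsets. */
enum ex_reg {
	EX_DEC_OUT_LUMA,
	EX_DEC_OUT_CHROMA,
	EX_PP_OUT_LUMA,
	EX_PP_OUT_CHROMA,
	EX_PP_ENABLE,
};

#define EX_LOG_MAX 16

/* Mock bus: records register writes in the order they are issued. */
struct ex_log {
	enum ex_reg reg[EX_LOG_MAX];
	uint32_t val[EX_LOG_MAX];
	size_t n;
};

void ex_write(struct ex_log *log, enum ex_reg reg, uint32_t val)
{
	if (log->n >= EX_LOG_MAX)
		return;
	log->reg[log->n] = reg;
	log->val[log->n] = val;
	log->n++;
}

/* Decoder (tiled) output buffer addresses. */
void ex_config_dec_output(struct ex_log *log, uint32_t luma, uint32_t chroma_off)
{
	ex_write(log, EX_DEC_OUT_LUMA, luma);
	ex_write(log, EX_DEC_OUT_CHROMA, luma + chroma_off);
}

/* Post-processor destination addresses, then the enable bit. */
void ex_enable_postproc(struct ex_log *log, uint32_t luma, uint32_t chroma_off)
{
	ex_write(log, EX_PP_OUT_LUMA, luma);
	ex_write(log, EX_PP_OUT_CHROMA, luma + chroma_off);
	ex_write(log, EX_PP_ENABLE, 1);
}

/* Ordering under test: output buffers are set before postproc is enabled. */
void ex_prepare_run(struct ex_log *log, uint32_t dec_luma, uint32_t pp_luma,
		    uint32_t chroma_off)
{
	ex_config_dec_output(log, dec_luma, chroma_off);
	ex_enable_postproc(log, pp_luma, chroma_off);
}
```

Replaying such a log in a test would catch a regression where the enable write
moves ahead of the buffer setup again.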

With minor adaptations, this series works completely on H6. I see
no reason not to merge the whole series.

Thanks for testing.

Best regards,
Jernej

> 
> Andrzej
> 
> > However, given that frames themselves are correctly decoded and, without
> > postprocessing, in the right order, that shouldn't block merging the previous
> > patches.
> > I tried a few different videos and frames were all decoded correctly.
> > 
> > Best regards,
> > Jernej
> > 
> >>
> >> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> >> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> >> ---
> >>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
> >>   drivers/staging/media/hantro/hantro_hw.h      |  1 +
> >>   .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
> >>   drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
> >>   4 files changed, 46 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> >> index 7f827b9f0133..1a26be72c878 100644
> >> --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> >> +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> >> @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
> >>   	hantro_reg_write(ctx->dev, &g2_out_dis, 0);
> >>   	hantro_reg_write(ctx->dev, &g2_output_format, 0);
> >>   
> >> -	luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
> >> +	luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
> >>   	hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
> >>   
> >>   	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> >> @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
> >>   	hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
> >>   	hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
> >>   
> >> -	luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
> >> +	luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
> >>   	hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
> >>   
> >>   	chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> >> @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
> >>   	config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
> >>   	config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
> >>   
> >> -	mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
> >> +	mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
> >>   		  mv_offset(ctx, dec_params);
> >>   	hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
> >>   
> >> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> >> index 2961d399fd60..3d4a5dc1e6d5 100644
> >> --- a/drivers/staging/media/hantro/hantro_hw.h
> >> +++ b/drivers/staging/media/hantro/hantro_hw.h
> >> @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
> >>   extern const struct hantro_variant sama5d4_vdec_variant;
> >>   
> >>   extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
> >> +extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
> >>   
> >>   extern const u32 hantro_vp8_dec_mc_filter[8][6];
> >>   
> >> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> >> index 4549aec08feb..79a66d001738 100644
> >> --- a/drivers/staging/media/hantro/hantro_postproc.c
> >> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> >> @@ -11,6 +11,7 @@
> >>   #include "hantro.h"
> >>   #include "hantro_hw.h"
> >>   #include "hantro_g1_regs.h"
> >> +#include "hantro_g2_regs.h"
> >>   
> >>   #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
> >>   { \
> >> @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
> >>   	HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
> >>   }
> >>   
> >> +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
> >> +{
> >> +	struct hantro_dev *vpu = ctx->dev;
> >> +	struct vb2_v4l2_buffer *dst_buf;
> >> +	size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
> >> +	dma_addr_t dst_dma;
> >> +
> >> +	dst_buf = hantro_get_dst_buf(ctx);
> >> +	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> >> +
> >> +	hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
> >> +	hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
> >> +	hantro_reg_write(vpu, &g2_out_rs_e, 1);
> >> +}
> >> +
> >>   void hantro_postproc_free(struct hantro_ctx *ctx)
> >>   {
> >>   	struct hantro_dev *vpu = ctx->dev;
> >> @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
> >>   	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
> >>   		buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
> >>   						ctx->dst_fmt.height);
> >> +	else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
> >> +		buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
> >> +					       ctx->dst_fmt.height);
> >>   
> >>   	for (i = 0; i < num_buffers; ++i) {
> >>   		struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
> >> @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
> >>   	HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
> >>   }
> >>   
> >> +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
> >> +{
> >> +	struct hantro_dev *vpu = ctx->dev;
> >> +
> >> +	hantro_reg_write(vpu, &g2_out_rs_e, 0);
> >> +}
> >> +
> >>   void hantro_postproc_disable(struct hantro_ctx *ctx)
> >>   {
> >>   	struct hantro_dev *vpu = ctx->dev;
> >> @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
> >>   	.enable = hantro_postproc_g1_enable,
> >>   	.disable = hantro_postproc_g1_disable,
> >>   };
> >> +
> >> +const struct hantro_postproc_ops hantro_g2_postproc_ops = {
> >> +	.enable = hantro_postproc_g2_enable,
> >> +	.disable = hantro_postproc_g2_disable,
> >> +};
> >> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> >> index 455a107ffb02..1a43f6fceef9 100644
> >> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
> >> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> >> @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
> >>   	},
> >>   };
> >>   
> >> +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
> >> +	{
> >> +		.fourcc = V4L2_PIX_FMT_NV12,
> >> +		.codec_mode = HANTRO_MODE_NONE,
> >> +		.postprocessed = true,
> >> +	},
> >> +};
> >> +
> >>   static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
> >>   	{
> >>   		.fourcc = V4L2_PIX_FMT_NV12_4L4,
> >> @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
> >>   	.dec_offset = 0x0,
> >>   	.dec_fmts = imx8m_vpu_g2_dec_fmts,
> >>   	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
> >> +	.postproc_fmts = imx8m_vpu_g2_postproc_fmts,
> >> +	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
> >> +	.postproc_ops = &hantro_g2_postproc_ops,
> >>   	.codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
> >>   	.codec_ops = imx8mq_vpu_g2_codec_ops,
> >>   	.init = imx8mq_vpu_hw_init,
> >> -- 
> >> 2.17.1
> >>
> >>
> > 
> > 
> 
> 




* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (10 preceding siblings ...)
  2021-09-29 16:04 ` [PATCH v7 11/11] media: hantro: Support NV12 " Andrzej Pietrasiewicz
@ 2021-10-19 17:55 ` Ezequiel Garcia
  2021-11-11 14:44 ` Hans Verkuil
  2021-11-15 15:07 ` Hans Verkuil
  13 siblings, 0 replies; 37+ messages in thread
From: Ezequiel Garcia @ 2021-10-19 17:55 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, Jernej Skrabec, Nicolas Dufresne, Daniel Almeida
  Cc: linux-media, linux-arm-kernel, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...,
	open list:STAGING SUBSYSTEM, Benjamin Gaignard, Boris Brezillon,
	Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, NXP Linux Team, Pengutronix Kernel Team,
	Philipp Zabel, Sascha Hauer, Shawn Guo, Collabora Kernel ML

Hi everyone,

On Wed, 29 Sept 2021 at 12:04, Andrzej Pietrasiewicz
<andrzej.p@collabora.com> wrote:
>
> Dear all,
>
> This patch series adds VP9 codec V4L2 control interface and two drivers
> using the new controls. It is a follow-up of previous v6 series [1].
>
> In this iteration, we've implemented VP9 hardware decoding on two devices:
> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
> The i.MX8M driver needs proper power domains support, though, which is a
> subject of a different effort, but in all 3 cases we were able to run the
> drivers.
>
> GStreamer support is also available, the needed changes have been submitted
> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
> VP9 V4L2 controls to be merged and released.
>
> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
> using Fluster[3]. There are still a few tests that are not passing, due to
> dynamic frame resize (not yet supported by V4L2) and small size videos
> (due to IP block limitations).
>
> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
> merged without passing through staging, as agreed[4]. The ABI has been checked
> for padding and verified to contain no holes.
>
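
For readers unfamiliar with what "no holes" means for a uAPI struct, here is a
minimal sketch, assuming a made-up payload. struct example_ctrl and its fields
are hypothetical and are not one of the structs added by this series. The idea
is that every byte is a named member, so sizeof() equals the sum of the member
sizes and the layout is the same regardless of what padding a compiler would
otherwise insert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control payload, for illustration only. */
struct example_ctrl {
	uint32_t flags;
	uint16_t width;
	uint16_t height;
	uint8_t  level;
	uint8_t  reserved[7];	/* explicit padding instead of a hidden hole */
	uint64_t timestamp;
};

/* Sum of the member sizes; equals sizeof() only if nothing is hidden. */
#define EXAMPLE_CTRL_MEMBER_BYTES \
	(sizeof(uint32_t) + 2 * sizeof(uint16_t) + sizeof(uint8_t) + 7 + \
	 sizeof(uint64_t))

/* Compile-time checks: the build fails if the compiler added padding. */
static_assert(sizeof(struct example_ctrl) == EXAMPLE_CTRL_MEMBER_BYTES,
	      "struct example_ctrl contains implicit padding");
static_assert(offsetof(struct example_ctrl, timestamp) == 16,
	      "timestamp is not at the expected offset");
```

A tool such as pahole reports the same information from debug data; the
compile-time form is just one way to keep the property from regressing.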

I took another look at this, and I'm fairly happy with it.

I'd just like to have an A-b or R-b from Nicolas Dufresne and
Daniel Almeida, given they've done a lot of work on the client side
of the API.

Another option would be to wait until Jernej finishes the work on
Allwinner H6, so that we have one more hardware platform supported.

Thanks,
Ezequiel

> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
> [3] https://github.com/fluendo/fluster
> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>
> The series depends on the YUV tiled format support prepared by Ezequiel:
> https://www.spinics.net/lists/linux-media/msg197047.html
>
> Rebased onto latest media_tree.
>
> Changes related to v6:
> - moved setting tile filter and tile bsd auxiliary buffer addresses so
> that they are always set, even if no tiles are used (thanks, Jernej)
> - added a comment near the place where the 32-bit DMA mask is applied
>   (thanks, Nicolas)
> - improved consistency in register names (thanks, Nicolas)
>
> Changes related to v5:
> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
> - improved pdf output of documentation
> - added Benjamin's Reviewed-by (thanks, Benjamin)
>
> Changes related to v4:
> - removed unused enum v4l2_vp9_intra_prediction_mode
> - converted remaining enums to defines to follow the convention
> - improved the documentation, in particular better documented how to use segmentation
> features
>
> Changes related to v3:
>
> Apply suggestions from Jernej's review (thanks, Jernej):
> - renamed a control and two structs:
>         V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>                 V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>         v4l2_ctrl_vp9_compressed_hdr_probs =>
>                 v4l2_ctrl_vp9_compressed_hdr
>         v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
> - explicitly assigned values to all other vp9 enums
>
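
As an illustration of why values that are tested against a bitfield need to be
distinct bits, here is a minimal sketch. The EX_SIGN_BIAS_* names and the
helper are hypothetical and are not the identifiers from the patch; the old
values are not shown in this thread. The general point is that sequential
values such as 0, 1, 2 cannot serve as masks, because 0 never matches and the
others do not combine cleanly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: one bit per reference, so values can be OR-ed and masked. */
#define EX_SIGN_BIAS_LAST	(1u << 0)
#define EX_SIGN_BIAS_GOLDEN	(1u << 1)
#define EX_SIGN_BIAS_ALT	(1u << 2)

/* Return true if the given reference has its sign-bias bit set. */
bool ex_ref_sign_bias(uint8_t ref_frame_sign_bias, uint8_t ref_bit)
{
	return (ref_frame_sign_bias & ref_bit) != 0;
}
```

For example, ex_ref_sign_bias(EX_SIGN_BIAS_GOLDEN | EX_SIGN_BIAS_ALT,
EX_SIGN_BIAS_LAST) is false, while the same call with EX_SIGN_BIAS_ALT is true.
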
> Apply suggestion from Nicolas's review (thanks, Nicolas):
> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
> and implemented only by drivers which need it
>
> Changes related to the RFC v2:
>
> - added another driver including a postprocessor to de-tile
>         codec-specific tiling
> - reworked uAPI structs layout to follow VP8 style
> - changed validation of loop filter params
> - changed validation of segmentation params
> - changed validation of VP9 frame params
> - removed level lookup array from loop filter struct
>         (can be computed by drivers)
> - renamed some enum values to match the spec more closely
> - V4L2 VP9 library changed the 'eob' member of
>         'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>         of pointers instead of an array of pointers to arrays
>         (IPs such as g2 creatively pass parts of the 'eob' counts in
>         the 'coeff' counts)
> - factored out several repeated portions of code
> - minor nitpicks and cleanups
>
> Andrzej Pietrasiewicz (6):
>   media: uapi: Add VP9 stateless decoder controls
>   media: Add VP9 v4l2 library
>   media: hantro: Rename registers
>   media: hantro: Prepare for other G2 codecs
>   media: hantro: Support VP9 on the G2 core
>   media: hantro: Support NV12 on the G2 core
>
> Boris Brezillon (1):
>   media: rkvdec: Add the VP9 backend
>
> Ezequiel Garcia (4):
>   hantro: postproc: Fix motion vector space size
>   hantro: postproc: Introduce struct hantro_postproc_ops
>   hantro: Simplify postprocessor
>   hantro: Add quirk for NV12/NV12_4L4 capture format
>
>  .../userspace-api/media/v4l/biblio.rst        |   10 +
>  .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>  .../media/v4l/pixfmt-compressed.rst           |   15 +
>  .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>  .../media/v4l/vidioc-queryctrl.rst            |   12 +
>  .../media/videodev2.h.rst.exceptions          |    2 +
>  drivers/media/v4l2-core/Kconfig               |    4 +
>  drivers/media/v4l2-core/Makefile              |    1 +
>  drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>  drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>  drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>  drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>  drivers/staging/media/hantro/Kconfig          |    1 +
>  drivers/staging/media/hantro/Makefile         |    7 +-
>  drivers/staging/media/hantro/hantro.h         |   40 +-
>  drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>  drivers/staging/media/hantro/hantro_g2.c      |   27 +
>  .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>  drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>  .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>  .../staging/media/hantro/hantro_postproc.c    |   79 +-
>  drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>  drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>  drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>  drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>  .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>  .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>  drivers/staging/media/rkvdec/Kconfig          |    1 +
>  drivers/staging/media/rkvdec/Makefile         |    2 +-
>  drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>  drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>  drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>  include/media/v4l2-ctrls.h                    |    4 +
>  include/media/v4l2-vp9.h                      |  182 ++
>  include/uapi/linux/v4l2-controls.h            |  284 +++
>  include/uapi/linux/videodev2.h                |    6 +
>  37 files changed, 6033 insertions(+), 104 deletions(-)
>  create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>  create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>  create mode 100644 include/media/v4l2-vp9.h
>
>
> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
> --
> 2.17.1
>


* Re: [PATCH v7 07/11] media: rkvdec: Add the VP9 backend
  2021-09-29 16:04 ` [PATCH v7 07/11] media: rkvdec: Add the VP9 backend Andrzej Pietrasiewicz
  2021-10-08 10:30   ` Chen-Yu Tsai
@ 2021-10-19 23:24   ` Alex Bee
  2021-10-20 13:07     ` Andrzej Pietrasiewicz
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Bee @ 2021-10-19 23:24 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia, Adrian Ratiu

Hi Andrzej,

On 29.09.21 at 18:04, Andrzej Pietrasiewicz wrote:
> From: Boris Brezillon <boris.brezillon@collabora.com>
> 
> The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add
> a backend for this new format.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> Co-developed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> ---
>   drivers/staging/media/rkvdec/Kconfig      |    1 +
>   drivers/staging/media/rkvdec/Makefile     |    2 +-
>   drivers/staging/media/rkvdec/rkvdec-vp9.c | 1078 +++++++++++++++++++++
>   drivers/staging/media/rkvdec/rkvdec.c     |   52 +-
>   drivers/staging/media/rkvdec/rkvdec.h     |   12 +-
>   5 files changed, 1137 insertions(+), 8 deletions(-)
>   create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
> 
> diff --git a/drivers/staging/media/rkvdec/Kconfig b/drivers/staging/media/rkvdec/Kconfig
> index c02199b5e0fd..dc7292f346fa 100644
> --- a/drivers/staging/media/rkvdec/Kconfig
> +++ b/drivers/staging/media/rkvdec/Kconfig
> @@ -9,6 +9,7 @@ config VIDEO_ROCKCHIP_VDEC
>   	select VIDEOBUF2_VMALLOC
>   	select V4L2_MEM2MEM_DEV
>   	select V4L2_H264
> +	select V4L2_VP9
>   	help
>   	  Support for the Rockchip Video Decoder IP present on Rockchip SoCs,
>   	  which accelerates video decoding.
> diff --git a/drivers/staging/media/rkvdec/Makefile b/drivers/staging/media/rkvdec/Makefile
> index c08fed0a39f9..cb86b429cfaa 100644
> --- a/drivers/staging/media/rkvdec/Makefile
> +++ b/drivers/staging/media/rkvdec/Makefile
> @@ -1,3 +1,3 @@
>   obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o
>   
> -rockchip-vdec-y += rkvdec.o rkvdec-h264.o
> +rockchip-vdec-y += rkvdec.o rkvdec-h264.o rkvdec-vp9.o
> diff --git a/drivers/staging/media/rkvdec/rkvdec-vp9.c b/drivers/staging/media/rkvdec/rkvdec-vp9.c
> new file mode 100644
> index 000000000000..ca463f18651a
> --- /dev/null
> +++ b/drivers/staging/media/rkvdec/rkvdec-vp9.c
> @@ -0,0 +1,1078 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Rockchip Video Decoder VP9 backend
> + *
> + * Copyright (C) 2019 Collabora, Ltd.
> + *	Boris Brezillon <boris.brezillon@collabora.com>
> + * Copyright (C) 2021 Collabora, Ltd.
> + *	Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> + *
> + * Copyright (C) 2016 Rockchip Electronics Co., Ltd.
> + *	Alpha Lin <Alpha.Lin@rock-chips.com>
> + */
> +
> +/*
> + * For following the vp9 spec please start reading this driver
> + * code from rkvdec_vp9_run() followed by rkvdec_vp9_done().
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/vmalloc.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/v4l2-vp9.h>
> +
> +#include "rkvdec.h"
> +#include "rkvdec-regs.h"
> +
> +#define RKVDEC_VP9_PROBE_SIZE		4864
> +#define RKVDEC_VP9_COUNT_SIZE		13232
> +#define RKVDEC_VP9_MAX_SEGMAP_SIZE	73728
> +
> +struct rkvdec_vp9_intra_mode_probs {
> +	u8 y_mode[105];
> +	u8 uv_mode[23];
> +};
> +
> +struct rkvdec_vp9_intra_only_frame_probs {
> +	u8 coef_intra[4][2][128];
> +	struct rkvdec_vp9_intra_mode_probs intra_mode[10];
> +};
> +
> +struct rkvdec_vp9_inter_frame_probs {
> +	u8 y_mode[4][9];
> +	u8 comp_mode[5];
> +	u8 comp_ref[5];
> +	u8 single_ref[5][2];
> +	u8 inter_mode[7][3];
> +	u8 interp_filter[4][2];
> +	u8 padding0[11];
> +	u8 coef[2][4][2][128];
> +	u8 uv_mode_0_2[3][9];
> +	u8 padding1[5];
> +	u8 uv_mode_3_5[3][9];
> +	u8 padding2[5];
> +	u8 uv_mode_6_8[3][9];
> +	u8 padding3[5];
> +	u8 uv_mode_9[9];
> +	u8 padding4[7];
> +	u8 padding5[16];
> +	struct {
> +		u8 joint[3];
> +		u8 sign[2];
> +		u8 classes[2][10];
> +		u8 class0_bit[2];
> +		u8 bits[2][10];
> +		u8 class0_fr[2][2][3];
> +		u8 fr[2][3];
> +		u8 class0_hp[2];
> +		u8 hp[2];
> +	} mv;
> +};
> +
> +struct rkvdec_vp9_probs {
> +	u8 partition[16][3];
> +	u8 pred[3];
> +	u8 tree[7];
> +	u8 skip[3];
> +	u8 tx32[2][3];
> +	u8 tx16[2][2];
> +	u8 tx8[2][1];
> +	u8 is_inter[4];
> +	/* 128 bit alignment */
> +	u8 padding0[3];
> +	union {
> +		struct rkvdec_vp9_inter_frame_probs inter;
> +		struct rkvdec_vp9_intra_only_frame_probs intra_only;
> +	};
> +};
> +
> +/* Data structure describing auxiliary buffer format. */
> +struct rkvdec_vp9_priv_tbl {
> +	struct rkvdec_vp9_probs probs;
> +	u8 segmap[2][RKVDEC_VP9_MAX_SEGMAP_SIZE];
> +};
> +
> +struct rkvdec_vp9_refs_counts {
> +	u32 eob[2];
> +	u32 coeff[3];
> +};
> +
> +struct rkvdec_vp9_inter_frame_symbol_counts {
> +	u32 partition[16][4];
> +	u32 skip[3][2];
> +	u32 inter[4][2];
> +	u32 tx32p[2][4];
> +	u32 tx16p[2][4];
> +	u32 tx8p[2][2];
> +	u32 y_mode[4][10];
> +	u32 uv_mode[10][10];
> +	u32 comp[5][2];
> +	u32 comp_ref[5][2];
> +	u32 single_ref[5][2][2];
> +	u32 mv_mode[7][4];
> +	u32 filter[4][3];
> +	u32 mv_joint[4];
> +	u32 sign[2][2];
> +	/* add 1 element for align */
> +	u32 classes[2][11 + 1];
> +	u32 class0[2][2];
> +	u32 bits[2][10][2];
> +	u32 class0_fp[2][2][4];
> +	u32 fp[2][4];
> +	u32 class0_hp[2][2];
> +	u32 hp[2][2];
> +	struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
> +};
> +
> +struct rkvdec_vp9_intra_frame_symbol_counts {
> +	u32 partition[4][4][4];
> +	u32 skip[3][2];
> +	u32 intra[4][2];
> +	u32 tx32p[2][4];
> +	u32 tx16p[2][4];
> +	u32 tx8p[2][2];
> +	struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
> +};
> +
> +struct rkvdec_vp9_run {
> +	struct rkvdec_run base;
> +	const struct v4l2_ctrl_vp9_frame *decode_params;
> +};
> +
> +struct rkvdec_vp9_frame_info {
> +	u32 valid : 1;
> +	u32 segmapid : 1;
> +	u32 frame_context_idx : 2;
> +	u32 reference_mode : 2;
> +	u32 tx_mode : 3;
> +	u32 interpolation_filter : 3;
> +	u32 flags;
> +	u64 timestamp;
> +	struct v4l2_vp9_segmentation seg;
> +	struct v4l2_vp9_loop_filter lf;
> +};
> +
> +struct rkvdec_vp9_ctx {
> +	struct rkvdec_aux_buf priv_tbl;
> +	struct rkvdec_aux_buf count_tbl;
> +	struct v4l2_vp9_frame_symbol_counts inter_cnts;
> +	struct v4l2_vp9_frame_symbol_counts intra_cnts;
> +	struct v4l2_vp9_frame_context probability_tables;
> +	struct v4l2_vp9_frame_context frame_context[4];
> +	struct rkvdec_vp9_frame_info cur;
> +	struct rkvdec_vp9_frame_info last;
> +};
> +
> +static void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane)
> +{
> +	unsigned int idx = 0, byte_count = 0;
> +	int k, m, n;
> +	u8 p;
> +
> +	for (k = 0; k < 6; k++) {
> +		for (m = 0; m < 6; m++) {
> +			for (n = 0; n < 3; n++) {
> +				p = coef[k][m][n];
> +				coeff_plane[idx++] = p;
> +				byte_count++;
> +				if (byte_count == 27) {
> +					idx += 5;
> +					byte_count = 0;
> +				}
> +			}
> +		}
> +	}
> +}
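
The packing done by write_coeff_plane() can be read as a closed-form index
mapping: 27 payload bytes go into each 32-byte slot, and 5 bytes are skipped
after each group. Below is a standalone sketch of that reading. packed_index()
and pack_coeff_plane() are hypothetical helpers, the source is taken as a flat
108-byte array (6 * 6 * 3), and the memset is added here only to keep the
sketch self-contained; in the patch the table is cleared earlier, in
init_probs().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_GROUP_BYTES	27	/* payload bytes per group, as in the loop above */
#define EX_GROUP_PAD	5	/* bytes skipped after each full group */
#define EX_SRC_BYTES	(6 * 6 * 3)
#define EX_PLANE_BYTES	128

/* Destination index of source byte s: 27 payload bytes per 32-byte slot. */
size_t packed_index(size_t s)
{
	return s + EX_GROUP_PAD * (s / EX_GROUP_BYTES);
}

/* Copy a flat 108-byte table into a zeroed 128-byte plane, leaving the gaps. */
void pack_coeff_plane(const uint8_t *src, uint8_t *plane)
{
	size_t s;

	memset(plane, 0, EX_PLANE_BYTES);
	for (s = 0; s < EX_SRC_BYTES; s++)
		plane[packed_index(s)] = src[s];
}
```

With this mapping the last source byte (index 107) lands at index 122, so the
four groups fit in the 128-byte rows of coef_intra and coef.
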
> +
> +static void init_intra_only_probs(struct rkvdec_ctx *ctx,
> +				  const struct rkvdec_vp9_run *run)
> +{
> +	const struct v4l2_ctrl_vp9_frame *dec_params;
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
> +	struct rkvdec_vp9_intra_only_frame_probs *rkprobs;
> +	const struct v4l2_vp9_frame_context *probs;
> +	unsigned int i, j, k, m;
> +
> +	rkprobs = &tbl->probs.intra_only;
> +	dec_params = run->decode_params;
> +	probs = &vp9_ctx->probability_tables;
> +
> +	/*
> +	 * intra only 149 x 128 bits ,aligned to 152 x 128 bits coeff related
> +	 * prob 64 x 128 bits
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(probs->coef); i++) {
> +		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++)
> +			write_coeff_plane(probs->coef[i][j][0],
> +					  rkprobs->coef_intra[i][j]);
> +	}
> +
> +	/* intra mode prob  80 x 128 bits */
> +	for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob); i++) {
> +		unsigned int byte_count = 0;
> +		int idx = 0;
> +
> +		/* vp9_kf_y_mode_prob */
> +		for (j = 0; j < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0]); j++) {
> +			for (k = 0; k < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0][0]);
> +			     k++) {
> +				u8 val = v4l2_vp9_kf_y_mode_prob[i][j][k];
> +
> +				rkprobs->intra_mode[i].y_mode[idx++] = val;
> +				byte_count++;
> +				if (byte_count == 27) {
> +					byte_count = 0;
> +					idx += 5;
> +				}
> +			}
> +		}
> +
> +		idx = 0;
> +		if (i < 4) {
> +			for (m = 0; m < (i < 3 ? 23 : 21); m++) {
> +				const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;
> +
> +				rkprobs->intra_mode[i].uv_mode[idx++] = ptr[i * 23 + m];
> +			}
> +		}
> +	}
> +}
> +
> +static void init_inter_probs(struct rkvdec_ctx *ctx,
> +			     const struct rkvdec_vp9_run *run)
> +{
> +	const struct v4l2_ctrl_vp9_frame *dec_params;
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
> +	struct rkvdec_vp9_inter_frame_probs *rkprobs;
> +	const struct v4l2_vp9_frame_context *probs;
> +	unsigned int i, j, k;
> +
> +	rkprobs = &tbl->probs.inter;
> +	dec_params = run->decode_params;
> +	probs = &vp9_ctx->probability_tables;
> +
> +	/*
> +	 * inter probs
> +	 * 151 x 128 bits, aligned to 152 x 128 bits
> +	 * inter only
> +	 * intra_y_mode & inter_block info 6 x 128 bits
> +	 */
> +
> +	memcpy(rkprobs->y_mode, probs->y_mode, sizeof(rkprobs->y_mode));
> +	memcpy(rkprobs->comp_mode, probs->comp_mode,
> +	       sizeof(rkprobs->comp_mode));
> +	memcpy(rkprobs->comp_ref, probs->comp_ref,
> +	       sizeof(rkprobs->comp_ref));
> +	memcpy(rkprobs->single_ref, probs->single_ref,
> +	       sizeof(rkprobs->single_ref));
> +	memcpy(rkprobs->inter_mode, probs->inter_mode,
> +	       sizeof(rkprobs->inter_mode));
> +	memcpy(rkprobs->interp_filter, probs->interp_filter,
> +	       sizeof(rkprobs->interp_filter));
> +
> +	/* 128 x 128 bits coeff related */
> +	for (i = 0; i < ARRAY_SIZE(probs->coef); i++) {
> +		for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) {
> +			for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++)
> +				write_coeff_plane(probs->coef[i][j][k],
> +						  rkprobs->coef[k][i][j]);
> +		}
> +	}
> +
> +	/* intra uv mode 6 x 128 */
> +	memcpy(rkprobs->uv_mode_0_2, &probs->uv_mode[0],
> +	       sizeof(rkprobs->uv_mode_0_2));
> +	memcpy(rkprobs->uv_mode_3_5, &probs->uv_mode[3],
> +	       sizeof(rkprobs->uv_mode_3_5));
> +	memcpy(rkprobs->uv_mode_6_8, &probs->uv_mode[6],
> +	       sizeof(rkprobs->uv_mode_6_8));
> +	memcpy(rkprobs->uv_mode_9, &probs->uv_mode[9],
> +	       sizeof(rkprobs->uv_mode_9));
> +
> +	/* mv related 6 x 128 */
> +	memcpy(rkprobs->mv.joint, probs->mv.joint,
> +	       sizeof(rkprobs->mv.joint));
> +	memcpy(rkprobs->mv.sign, probs->mv.sign,
> +	       sizeof(rkprobs->mv.sign));
> +	memcpy(rkprobs->mv.classes, probs->mv.classes,
> +	       sizeof(rkprobs->mv.classes));
> +	memcpy(rkprobs->mv.class0_bit, probs->mv.class0_bit,
> +	       sizeof(rkprobs->mv.class0_bit));
> +	memcpy(rkprobs->mv.bits, probs->mv.bits,
> +	       sizeof(rkprobs->mv.bits));
> +	memcpy(rkprobs->mv.class0_fr, probs->mv.class0_fr,
> +	       sizeof(rkprobs->mv.class0_fr));
> +	memcpy(rkprobs->mv.fr, probs->mv.fr,
> +	       sizeof(rkprobs->mv.fr));
> +	memcpy(rkprobs->mv.class0_hp, probs->mv.class0_hp,
> +	       sizeof(rkprobs->mv.class0_hp));
> +	memcpy(rkprobs->mv.hp, probs->mv.hp,
> +	       sizeof(rkprobs->mv.hp));
> +}
> +
> +static void init_probs(struct rkvdec_ctx *ctx,
> +		       const struct rkvdec_vp9_run *run)
> +{
> +	const struct v4l2_ctrl_vp9_frame *dec_params;
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu;
> +	struct rkvdec_vp9_probs *rkprobs = &tbl->probs;
> +	const struct v4l2_vp9_segmentation *seg;
> +	const struct v4l2_vp9_frame_context *probs;
> +	bool intra_only;
> +
> +	dec_params = run->decode_params;
> +	probs = &vp9_ctx->probability_tables;
> +	seg = &dec_params->seg;
> +
> +	memset(rkprobs, 0, sizeof(*rkprobs));
> +
> +	intra_only = !!(dec_params->flags &
> +			(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
> +			 V4L2_VP9_FRAME_FLAG_INTRA_ONLY));
> +
> +	/* sb info  5 x 128 bit */
> +	memcpy(rkprobs->partition,
> +	       intra_only ? v4l2_vp9_kf_partition_probs : probs->partition,
> +	       sizeof(rkprobs->partition));
> +
> +	memcpy(rkprobs->pred, seg->pred_probs, sizeof(rkprobs->pred));
> +	memcpy(rkprobs->tree, seg->tree_probs, sizeof(rkprobs->tree));
> +	memcpy(rkprobs->skip, probs->skip, sizeof(rkprobs->skip));
> +	memcpy(rkprobs->tx32, probs->tx32, sizeof(rkprobs->tx32));
> +	memcpy(rkprobs->tx16, probs->tx16, sizeof(rkprobs->tx16));
> +	memcpy(rkprobs->tx8, probs->tx8, sizeof(rkprobs->tx8));
> +	memcpy(rkprobs->is_inter, probs->is_inter, sizeof(rkprobs->is_inter));
> +
> +	if (intra_only)
> +		init_intra_only_probs(ctx, run);
> +	else
> +		init_inter_probs(ctx, run);
> +}
> +
> +struct rkvdec_vp9_ref_reg {
> +	u32 reg_frm_size;
> +	u32 reg_hor_stride;
> +	u32 reg_y_stride;
> +	u32 reg_yuv_stride;
> +	u32 reg_ref_base;
> +};
> +
> +static struct rkvdec_vp9_ref_reg ref_regs[] = {
> +	{
> +		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(0),
> +		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(0),
> +		.reg_y_stride = RKVDEC_VP9_LAST_FRAME_YSTRIDE,
> +		.reg_yuv_stride = RKVDEC_VP9_LAST_FRAME_YUVSTRIDE,
> +		.reg_ref_base = RKVDEC_REG_VP9_LAST_FRAME_BASE,
> +	},
> +	{
> +		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(1),
> +		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(1),
> +		.reg_y_stride = RKVDEC_VP9_GOLDEN_FRAME_YSTRIDE,
> +		.reg_yuv_stride = 0,
> +		.reg_ref_base = RKVDEC_REG_VP9_GOLDEN_FRAME_BASE,
> +	},
> +	{
> +		.reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(2),
> +		.reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(2),
> +		.reg_y_stride = RKVDEC_VP9_ALTREF_FRAME_YSTRIDE,
> +		.reg_yuv_stride = 0,
> +		.reg_ref_base = RKVDEC_REG_VP9_ALTREF_FRAME_BASE,
> +	}
> +};
> +
> +static struct rkvdec_decoded_buffer *
> +get_ref_buf(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp)
> +{
> +	struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
> +	struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
> +	int buf_idx;
> +
> +	/*
> +	 * If a ref is unused or invalid, address of current destination
> +	 * buffer is returned.
> +	 */
> +	buf_idx = vb2_find_timestamp(cap_q, timestamp, 0);
> +	if (buf_idx < 0)
> +		return vb2_to_rkvdec_decoded_buf(&dst->vb2_buf);
> +
> +	return vb2_to_rkvdec_decoded_buf(vb2_get_buffer(cap_q, buf_idx));
> +}
> +
> +static dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf)
> +{
> +	unsigned int aligned_pitch, aligned_height, yuv_len;
> +
> +	aligned_height = round_up(buf->vp9.height, 64);
> +	aligned_pitch = round_up(buf->vp9.width * buf->vp9.bit_depth, 512) / 8;
> +	yuv_len = (aligned_height * aligned_pitch * 3) / 2;
> +
> +	return vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0) +
> +	       yuv_len;
> +}
> +
> +static void config_ref_registers(struct rkvdec_ctx *ctx,
> +				 const struct rkvdec_vp9_run *run,
> +				 struct rkvdec_decoded_buffer *ref_buf,
> +				 struct rkvdec_vp9_ref_reg *ref_reg)
> +{
> +	unsigned int aligned_pitch, aligned_height, y_len, yuv_len;
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +
> +	aligned_height = round_up(ref_buf->vp9.height, 64);
> +	writel_relaxed(RKVDEC_VP9_FRAMEWIDTH(ref_buf->vp9.width) |
> +		       RKVDEC_VP9_FRAMEHEIGHT(ref_buf->vp9.height),
> +		       rkvdec->regs + ref_reg->reg_frm_size);
> +
> +	writel_relaxed(vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0),
> +		       rkvdec->regs + ref_reg->reg_ref_base);
> +
> +	if (&ref_buf->base.vb == run->base.bufs.dst)
> +		return;
> +
> +	aligned_pitch = round_up(ref_buf->vp9.width * ref_buf->vp9.bit_depth, 512) / 8;
> +	y_len = aligned_height * aligned_pitch;
> +	yuv_len = (y_len * 3) / 2;
> +
> +	writel_relaxed(RKVDEC_HOR_Y_VIRSTRIDE(aligned_pitch / 16) |
> +		       RKVDEC_HOR_UV_VIRSTRIDE(aligned_pitch / 16),
> +		       rkvdec->regs + ref_reg->reg_hor_stride);
> +	writel_relaxed(RKVDEC_VP9_REF_YSTRIDE(y_len / 16),
> +		       rkvdec->regs + ref_reg->reg_y_stride);
> +
> +	if (!ref_reg->reg_yuv_stride)
> +		return;
> +
> +	writel_relaxed(RKVDEC_VP9_REF_YUVSTRIDE(yuv_len / 16),
> +		       rkvdec->regs + ref_reg->reg_yuv_stride);
> +}
> +
> +static void config_seg_registers(struct rkvdec_ctx *ctx, unsigned int segid)
> +{
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	const struct v4l2_vp9_segmentation *seg;
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +	s16 feature_val;
> +	int feature_id;
> +	u32 val = 0;
> +
> +	seg = vp9_ctx->last.valid ? &vp9_ctx->last.seg : &vp9_ctx->cur.seg;
> +	feature_id = V4L2_VP9_SEG_LVL_ALT_Q;
> +	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
> +		feature_val = seg->feature_data[segid][feature_id];
> +		val |= RKVDEC_SEGID_FRAME_QP_DELTA_EN(1) |
> +		       RKVDEC_SEGID_FRAME_QP_DELTA(feature_val);
> +	}
> +
> +	feature_id = V4L2_VP9_SEG_LVL_ALT_L;
> +	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
> +		feature_val = seg->feature_data[segid][feature_id];
> +		val |= RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE_EN(1) |
> +		       RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE(feature_val);
> +	}
> +
> +	feature_id = V4L2_VP9_SEG_LVL_REF_FRAME;
> +	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) {
> +		feature_val = seg->feature_data[segid][feature_id];
> +		val |= RKVDEC_SEGID_REFERINFO_EN(1) |
> +		       RKVDEC_SEGID_REFERINFO(feature_val);
> +	}
> +
> +	feature_id = V4L2_VP9_SEG_LVL_SKIP;
> +	if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid))
> +		val |= RKVDEC_SEGID_FRAME_SKIP_EN(1);
> +
> +	if (!segid &&
> +	    (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE))
> +		val |= RKVDEC_SEGID_ABS_DELTA(1);
> +
> +	writel_relaxed(val, rkvdec->regs + RKVDEC_VP9_SEGID_GRP(segid));
> +}
> +
> +static void update_dec_buf_info(struct rkvdec_decoded_buffer *buf,
> +				const struct v4l2_ctrl_vp9_frame *dec_params)
> +{
> +	buf->vp9.width = dec_params->frame_width_minus_1 + 1;
> +	buf->vp9.height = dec_params->frame_height_minus_1 + 1;
> +	buf->vp9.bit_depth = dec_params->bit_depth;
> +}
> +
> +static void update_ctx_cur_info(struct rkvdec_vp9_ctx *vp9_ctx,
> +				struct rkvdec_decoded_buffer *buf,
> +				const struct v4l2_ctrl_vp9_frame *dec_params)
> +{
> +	vp9_ctx->cur.valid = true;
> +	vp9_ctx->cur.reference_mode = dec_params->reference_mode;
> +	vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter;
> +	vp9_ctx->cur.flags = dec_params->flags;
> +	vp9_ctx->cur.timestamp = buf->base.vb.vb2_buf.timestamp;
> +	vp9_ctx->cur.seg = dec_params->seg;
> +	vp9_ctx->cur.lf = dec_params->lf;
> +}
> +
> +static void update_ctx_last_info(struct rkvdec_vp9_ctx *vp9_ctx)
> +{
> +	vp9_ctx->last = vp9_ctx->cur;
> +}
> +
> +static void config_registers(struct rkvdec_ctx *ctx,
> +			     const struct rkvdec_vp9_run *run)
> +{
> +	unsigned int y_len, uv_len, yuv_len, bit_depth, aligned_height, aligned_pitch, stream_len;
> +	const struct v4l2_ctrl_vp9_frame *dec_params;
> +	struct rkvdec_decoded_buffer *ref_bufs[3];
> +	struct rkvdec_decoded_buffer *dst, *last, *mv_ref;
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	u32 val, last_frame_info = 0;
> +	const struct v4l2_vp9_segmentation *seg;
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +	dma_addr_t addr;
> +	bool intra_only;
> +	unsigned int i;
> +
> +	dec_params = run->decode_params;
> +	dst = vb2_to_rkvdec_decoded_buf(&run->base.bufs.dst->vb2_buf);
> +	ref_bufs[0] = get_ref_buf(ctx, &dst->base.vb, dec_params->last_frame_ts);
> +	ref_bufs[1] = get_ref_buf(ctx, &dst->base.vb, dec_params->golden_frame_ts);
> +	ref_bufs[2] = get_ref_buf(ctx, &dst->base.vb, dec_params->alt_frame_ts);
> +
> +	if (vp9_ctx->last.valid)
> +		last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp);
> +	else
> +		last = dst;
> +
> +	update_dec_buf_info(dst, dec_params);
> +	update_ctx_cur_info(vp9_ctx, dst, dec_params);
> +	seg = &dec_params->seg;
> +
> +	intra_only = !!(dec_params->flags &
> +			(V4L2_VP9_FRAME_FLAG_KEY_FRAME |
> +			 V4L2_VP9_FRAME_FLAG_INTRA_ONLY));
> +
> +	writel_relaxed(RKVDEC_MODE(RKVDEC_MODE_VP9),
> +		       rkvdec->regs + RKVDEC_REG_SYSCTRL);
> +
> +	bit_depth = dec_params->bit_depth;
> +	aligned_height = round_up(ctx->decoded_fmt.fmt.pix_mp.height, 64);
> +
> +	aligned_pitch = round_up(ctx->decoded_fmt.fmt.pix_mp.width *
> +				 bit_depth,
> +				 512) / 8;
> +	y_len = aligned_height * aligned_pitch;
> +	uv_len = y_len / 2;
> +	yuv_len = y_len + uv_len;
> +
> +	writel_relaxed(RKVDEC_Y_HOR_VIRSTRIDE(aligned_pitch / 16) |
> +		       RKVDEC_UV_HOR_VIRSTRIDE(aligned_pitch / 16),
> +		       rkvdec->regs + RKVDEC_REG_PICPAR);
> +	writel_relaxed(RKVDEC_Y_VIRSTRIDE(y_len / 16),
> +		       rkvdec->regs + RKVDEC_REG_Y_VIRSTRIDE);
> +	writel_relaxed(RKVDEC_YUV_VIRSTRIDE(yuv_len / 16),
> +		       rkvdec->regs + RKVDEC_REG_YUV_VIRSTRIDE);
> +
> +	stream_len = vb2_get_plane_payload(&run->base.bufs.src->vb2_buf, 0);
> +	writel_relaxed(RKVDEC_STRM_LEN(stream_len),
> +		       rkvdec->regs + RKVDEC_REG_STRM_LEN);
> +
> +	/*
> +	 * Reset count buffer, because decoder only output intra related syntax
> +	 * counts when decoding intra frame, but update entropy need to update
> +	 * all the probabilities.
> +	 */
> +	if (intra_only)
> +		memset(vp9_ctx->count_tbl.cpu, 0, vp9_ctx->count_tbl.size);
> +
> +	vp9_ctx->cur.segmapid = vp9_ctx->last.segmapid;
> +	if (!intra_only &&
> +	    !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
> +	    (!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED) ||
> +	     (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP)))
> +		vp9_ctx->cur.segmapid++;
> +
> +	for (i = 0; i < ARRAY_SIZE(ref_bufs); i++)
> +		config_ref_registers(ctx, run, ref_bufs[i], &ref_regs[i]);
> +
> +	for (i = 0; i < 8; i++)
> +		config_seg_registers(ctx, i);
> +
> +	writel_relaxed(RKVDEC_VP9_TX_MODE(vp9_ctx->cur.tx_mode) |
> +		       RKVDEC_VP9_FRAME_REF_MODE(dec_params->reference_mode),
> +		       rkvdec->regs + RKVDEC_VP9_CPRHEADER_CONFIG);
> +
> +	if (!intra_only) {
> +		const struct v4l2_vp9_loop_filter *lf;
> +		s8 delta;
> +
> +		if (vp9_ctx->last.valid)
> +			lf = &vp9_ctx->last.lf;
> +		else
> +			lf = &vp9_ctx->cur.lf;
> +
> +		val = 0;
> +		for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++) {
> +			delta = lf->ref_deltas[i];
> +			val |= RKVDEC_REF_DELTAS_LASTFRAME(i, delta);
> +		}
> +
> +		writel_relaxed(val,
> +			       rkvdec->regs + RKVDEC_VP9_REF_DELTAS_LASTFRAME);
> +
> +		for (i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++) {
> +			delta = lf->mode_deltas[i];
> +			last_frame_info |= RKVDEC_MODE_DELTAS_LASTFRAME(i,
> +									delta);
> +		}
> +	}
> +
> +	if (vp9_ctx->last.valid && !intra_only &&
> +	    vp9_ctx->last.seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED)
> +		last_frame_info |= RKVDEC_SEG_EN_LASTFRAME;
> +
> +	if (vp9_ctx->last.valid &&
> +	    vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_SHOW_FRAME)
> +		last_frame_info |= RKVDEC_LAST_SHOW_FRAME;
> +
> +	if (vp9_ctx->last.valid &&
> +	    vp9_ctx->last.flags &
> +	    (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY))
> +		last_frame_info |= RKVDEC_LAST_INTRA_ONLY;
> +
> +	if (vp9_ctx->last.valid &&
> +	    last->vp9.width == dst->vp9.width &&
> +	    last->vp9.height == dst->vp9.height)
> +		last_frame_info |= RKVDEC_LAST_WIDHHEIGHT_EQCUR;
> +
> +	writel_relaxed(last_frame_info,
> +		       rkvdec->regs + RKVDEC_VP9_INFO_LASTFRAME);
> +
> +	writel_relaxed(stream_len - dec_params->compressed_header_size -
> +		       dec_params->uncompressed_header_size,
> +		       rkvdec->regs + RKVDEC_VP9_LASTTILE_SIZE);
> +
> +	for (i = 0; !intra_only && i < ARRAY_SIZE(ref_bufs); i++) {
> +		unsigned int refw = ref_bufs[i]->vp9.width;
> +		unsigned int refh = ref_bufs[i]->vp9.height;
> +		u32 hscale, vscale;
> +
> +		hscale = (refw << 14) /	dst->vp9.width;
> +		vscale = (refh << 14) / dst->vp9.height;
> +		writel_relaxed(RKVDEC_VP9_REF_HOR_SCALE(hscale) |
> +			       RKVDEC_VP9_REF_VER_SCALE(vscale),
> +			       rkvdec->regs + RKVDEC_VP9_REF_SCALE(i));
> +	}
> +
> +	addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
> +	writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_DECOUT_BASE);
> +	addr = vb2_dma_contig_plane_dma_addr(&run->base.bufs.src->vb2_buf, 0);
> +	writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_STRM_RLC_BASE);
> +	writel_relaxed(vp9_ctx->priv_tbl.dma +
> +		       offsetof(struct rkvdec_vp9_priv_tbl, probs),
> +		       rkvdec->regs + RKVDEC_REG_CABACTBL_PROB_BASE);
> +	writel_relaxed(vp9_ctx->count_tbl.dma,
> +		       rkvdec->regs + RKVDEC_REG_VP9COUNT_BASE);
> +
> +	writel_relaxed(vp9_ctx->priv_tbl.dma +
> +		       offsetof(struct rkvdec_vp9_priv_tbl, segmap) +
> +		       (RKVDEC_VP9_MAX_SEGMAP_SIZE * vp9_ctx->cur.segmapid),
> +		       rkvdec->regs + RKVDEC_REG_VP9_SEGIDCUR_BASE);
> +	writel_relaxed(vp9_ctx->priv_tbl.dma +
> +		       offsetof(struct rkvdec_vp9_priv_tbl, segmap) +
> +		       (RKVDEC_VP9_MAX_SEGMAP_SIZE * (!vp9_ctx->cur.segmapid)),
> +		       rkvdec->regs + RKVDEC_REG_VP9_SEGIDLAST_BASE);
> +
> +	if (!intra_only &&
> +	    !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) &&
> +	    vp9_ctx->last.valid)
> +		mv_ref = last;
> +	else
> +		mv_ref = dst;
> +
> +	writel_relaxed(get_mv_base_addr(mv_ref),
> +		       rkvdec->regs + RKVDEC_VP9_REF_COLMV_BASE);
> +
> +	writel_relaxed(ctx->decoded_fmt.fmt.pix_mp.width |
> +		       (ctx->decoded_fmt.fmt.pix_mp.height << 16),
> +		       rkvdec->regs + RKVDEC_REG_PERFORMANCE_CYCLE);
> +}
> +
> +static int validate_dec_params(struct rkvdec_ctx *ctx,
> +			       const struct v4l2_ctrl_vp9_frame *dec_params)
> +{
> +	unsigned int aligned_width, aligned_height;
> +
> +	/* We only support profile 0. */
> +	if (dec_params->profile != 0) {
> +		dev_err(ctx->dev->dev, "unsupported profile %d\n",
> +			dec_params->profile);
> +		return -EINVAL;
> +	}
> +
> +	aligned_width = round_up(dec_params->frame_width_minus_1 + 1, 64);
> +	aligned_height = round_up(dec_params->frame_height_minus_1 + 1, 64);
> +
> +	/*
> +	 * Userspace should update the capture/decoded format when the
> +	 * resolution changes.
> +	 */
> +	if (aligned_width != ctx->decoded_fmt.fmt.pix_mp.width ||
> +	    aligned_height != ctx->decoded_fmt.fmt.pix_mp.height) {
> +		dev_err(ctx->dev->dev,
> +			"unexpected bitstream resolution %dx%d\n",
> +			dec_params->frame_width_minus_1 + 1,
> +			dec_params->frame_height_minus_1 + 1);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rkvdec_vp9_run_preamble(struct rkvdec_ctx *ctx,
> +				   struct rkvdec_vp9_run *run)
> +{
> +	const struct v4l2_ctrl_vp9_frame *dec_params;
> +	const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates;
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct v4l2_ctrl *ctrl;
> +	unsigned int fctx_idx;
> +	int ret;
> +
> +	/* v4l2-specific stuff */
> +	rkvdec_run_preamble(ctx, &run->base);
> +
> +	ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl,
> +			      V4L2_CID_STATELESS_VP9_FRAME);
> +	if (WARN_ON(!ctrl))
> +		return -EINVAL;
> +	dec_params = ctrl->p_cur.p;
> +
> +	ret = validate_dec_params(ctx, dec_params);
> +	if (ret)
> +		return ret;
> +
> +	run->decode_params = dec_params;
> +
> +	ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
> +	if (WARN_ON(!ctrl))
> +		return -EINVAL;
> +	prob_updates = ctrl->p_cur.p;
> +	vp9_ctx->cur.tx_mode = prob_updates->tx_mode;
> +
> +	/*
> +	 * vp9 stuff
> +	 *
> +	 * by this point the userspace has done all parts of 6.2 uncompressed_header()
> +	 * except this fragment:
> +	 * if ( FrameIsIntra || error_resilient_mode ) {
> +	 *	setup_past_independence ( )
> +	 *	if ( frame_type == KEY_FRAME || error_resilient_mode == 1 ||
> +	 *	     reset_frame_context == 3 ) {
> +	 *		for ( i = 0; i < 4; i ++ ) {
> +	 *			save_probs( i )
> +	 *		}
> +	 *	} else if ( reset_frame_context == 2 ) {
> +	 *		save_probs( frame_context_idx )
> +	 *	}
> +	 *	frame_context_idx = 0
> +	 * }
> +	 */
> +	fctx_idx = v4l2_vp9_reset_frame_ctx(dec_params, vp9_ctx->frame_context);
> +	vp9_ctx->cur.frame_context_idx = fctx_idx;
> +
> +	/* 6.1 frame(sz): load_probs() and load_probs2() */
> +	vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx];
> +
> +	/*
> +	 * The userspace has also performed 6.3 compressed_header(), but handling the
> +	 * probs in a special way. All probs which need updating, except MV-related,
> +	 * have been read from the bitstream and translated through inv_map_table[],
> +	 * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed
> +	 * by userspace are either translated values (there are no 0 values in
> +	 * inv_map_table[]), or zero to indicate no update. All MV-related probs which need
> +	 * updating have been read from the bitstream and (mv_prob << 1) | 1 has been
> +	 * performed. The values passed by userspace are either new values
> +	 * to replace old ones (the above mentioned shift and bitwise or never result in
> +	 * a zero) or zero to indicate no update.
> +	 * fw_update_probs() performs actual probs updates or leaves probs as-is
> +	 * for values for which a zero was passed from userspace.
> +	 */
> +	v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params);
> +
> +	return 0;
> +}
> +
> +static int rkvdec_vp9_run(struct rkvdec_ctx *ctx)
> +{
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +	struct rkvdec_vp9_run run = { };
> +	int ret;
> +
> +	ret = rkvdec_vp9_run_preamble(ctx, &run);
> +	if (ret) {
> +		rkvdec_run_postamble(ctx, &run.base);
> +		return ret;
> +	}
> +
> +	/* Prepare probs. */
> +	init_probs(ctx, &run);
> +
> +	/* Configure hardware registers. */
> +	config_registers(ctx, &run);
> +
> +	rkvdec_run_postamble(ctx, &run.base);
> +
> +	schedule_delayed_work(&rkvdec->watchdog_work, msecs_to_jiffies(2000));
> +
> +	writel(1, rkvdec->regs + RKVDEC_REG_PREF_LUMA_CACHE_COMMAND);
> +	writel(1, rkvdec->regs + RKVDEC_REG_PREF_CHR_CACHE_COMMAND);
> +
> +	writel(0xe, rkvdec->regs + RKVDEC_REG_STRMD_ERR_EN);
> +	/* Start decoding! */
> +	writel(RKVDEC_INTERRUPT_DEC_E | RKVDEC_CONFIG_DEC_CLK_GATE_E |
> +	       RKVDEC_TIMEOUT_E | RKVDEC_BUF_EMPTY_E,
> +	       rkvdec->regs + RKVDEC_REG_INTERRUPT);
> +
> +	return 0;
> +}
> +
> +#define copy_tx_and_skip(p1, p2)				\
> +do {								\
> +	memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8));	\
> +	memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16));	\
> +	memcpy((p1)->tx32, (p2)->tx32, sizeof((p1)->tx32));	\
> +	memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip));	\
> +} while (0)
> +
> +static void rkvdec_vp9_done(struct rkvdec_ctx *ctx,
> +			    struct vb2_v4l2_buffer *src_buf,
> +			    struct vb2_v4l2_buffer *dst_buf,
> +			    enum vb2_buffer_state result)
> +{
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	unsigned int fctx_idx;
> +
> +	/* v4l2-specific stuff */
> +	if (result == VB2_BUF_STATE_ERROR)
> +		goto out_update_last;
> +
> +	/*
> +	 * vp9 stuff
> +	 *
> +	 * 6.1.2 refresh_probs()
> +	 *
> +	 * In the spec a complementary condition goes last in 6.1.2 refresh_probs(),
> +	 * but it makes no sense to perform all the activities from the first "if"
> +	 * there if we actually are not refreshing the frame context. On top of that,
> +	 * because of 6.2 uncompressed_header() whenever error_resilient_mode == 1,
> +	 * refresh_frame_context == 0. Consequently, if we don't jump to out_update_last
> +	 * it means error_resilient_mode must be 0.
> +	 */
> +	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX))
> +		goto out_update_last;
> +
> +	fctx_idx = vp9_ctx->cur.frame_context_idx;
> +
> +	if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) {
> +		/* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */
> +		struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables;
> +		bool frame_is_intra = vp9_ctx->cur.flags &
> +		    (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY);
> +		struct tx_and_skip {
> +			u8 tx8[2][1];
> +			u8 tx16[2][2];
> +			u8 tx32[2][3];
> +			u8 skip[3];
> +		} _tx_skip, *tx_skip = &_tx_skip;
> +		struct v4l2_vp9_frame_symbol_counts *counts;
> +
> +		/* buffer the forward-updated TX and skip probs */
> +		if (frame_is_intra)
> +			copy_tx_and_skip(tx_skip, probs);
> +
> +		/* 6.1.2 refresh_probs(): load_probs() and load_probs2() */
> +		*probs = vp9_ctx->frame_context[fctx_idx];
> +
> +		/* if FrameIsIntra then undo the effect of load_probs2() */
> +		if (frame_is_intra)
> +			copy_tx_and_skip(probs, tx_skip);
> +
> +		counts = frame_is_intra ? &vp9_ctx->intra_cnts : &vp9_ctx->inter_cnts;
> +		v4l2_vp9_adapt_coef_probs(probs, counts,
> +					  !vp9_ctx->last.valid ||
> +					  vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME,
> +					  frame_is_intra);
> +		if (!frame_is_intra) {
> +			const struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts;
> +			u32 classes[2][11];
> +			int i;
> +
> +			inter_cnts = vp9_ctx->count_tbl.cpu;
> +			for (i = 0; i < ARRAY_SIZE(classes); ++i)
> +				memcpy(classes[i], inter_cnts->classes[i], sizeof(classes[0]));
> +			counts->classes = &classes;
> +
> +			/* load_probs2() already done */
> +			v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts,
> +						     vp9_ctx->cur.reference_mode,
> +						     vp9_ctx->cur.interpolation_filter,
> +						     vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags);
> +		}
> +	}
> +
> +	/* 6.1.2 refresh_probs(): save_probs(fctx_idx) */
> +	vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables;
> +
> +out_update_last:
> +	update_ctx_last_info(vp9_ctx);
> +}
> +
> +static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx)
> +{
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu;
> +	struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu;
> +	int i, j, k, l, m;
> +
> +	vp9_ctx->inter_cnts.partition = &inter_cnts->partition;
> +	vp9_ctx->inter_cnts.skip = &inter_cnts->skip;
> +	vp9_ctx->inter_cnts.intra_inter = &inter_cnts->inter;
> +	vp9_ctx->inter_cnts.tx32p = &inter_cnts->tx32p;
> +	vp9_ctx->inter_cnts.tx16p = &inter_cnts->tx16p;
> +	vp9_ctx->inter_cnts.tx8p = &inter_cnts->tx8p;
> +
> +	vp9_ctx->intra_cnts.partition = (u32 (*)[16][4])(&intra_cnts->partition);
> +	vp9_ctx->intra_cnts.skip = &intra_cnts->skip;
> +	vp9_ctx->intra_cnts.intra_inter = &intra_cnts->intra;
> +	vp9_ctx->intra_cnts.tx32p = &intra_cnts->tx32p;
> +	vp9_ctx->intra_cnts.tx16p = &intra_cnts->tx16p;
> +	vp9_ctx->intra_cnts.tx8p = &intra_cnts->tx8p;
> +
> +	vp9_ctx->inter_cnts.y_mode = &inter_cnts->y_mode;
> +	vp9_ctx->inter_cnts.uv_mode = &inter_cnts->uv_mode;
> +	vp9_ctx->inter_cnts.comp = &inter_cnts->comp;
> +	vp9_ctx->inter_cnts.comp_ref = &inter_cnts->comp_ref;
> +	vp9_ctx->inter_cnts.single_ref = &inter_cnts->single_ref;
> +	vp9_ctx->inter_cnts.mv_mode = &inter_cnts->mv_mode;
> +	vp9_ctx->inter_cnts.filter = &inter_cnts->filter;
> +	vp9_ctx->inter_cnts.mv_joint = &inter_cnts->mv_joint;
> +	vp9_ctx->inter_cnts.sign = &inter_cnts->sign;
> +	/*
> +	 * rk hardware actually uses "u32 classes[2][11 + 1];"
> +	 * instead of "u32 classes[2][11];", so this must be explicitly
> +	 * copied into vp9_ctx->classes when passing the data to the
> +	 * vp9 library function
> +	 */
> +	vp9_ctx->inter_cnts.class0 = &inter_cnts->class0;
> +	vp9_ctx->inter_cnts.bits = &inter_cnts->bits;
> +	vp9_ctx->inter_cnts.class0_fp = &inter_cnts->class0_fp;
> +	vp9_ctx->inter_cnts.fp = &inter_cnts->fp;
> +	vp9_ctx->inter_cnts.class0_hp = &inter_cnts->class0_hp;
> +	vp9_ctx->inter_cnts.hp = &inter_cnts->hp;
> +
> +#define INNERMOST_LOOP \
> +	do {										\
> +		for (m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {\
> +			vp9_ctx->inter_cnts.coeff[i][j][k][l][m] =			\
> +				&inter_cnts->ref_cnt[k][i][j][l][m].coeff;		\
> +			vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] =			\
> +				&inter_cnts->ref_cnt[k][i][j][l][m].eob[0];		\
> +			vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] =			\
> +				&inter_cnts->ref_cnt[k][i][j][l][m].eob[1];		\
> +											\
> +			vp9_ctx->intra_cnts.coeff[i][j][k][l][m] =			\
> +				&intra_cnts->ref_cnt[k][i][j][l][m].coeff;		\
> +			vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] =			\
> +				&intra_cnts->ref_cnt[k][i][j][l][m].eob[0];		\
> +			vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] =			\
> +				&intra_cnts->ref_cnt[k][i][j][l][m].eob[1];		\
> +		}									\
> +	} while (0)
> +
> +	for (i = 0; i < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff); ++i)
> +		for (j = 0; j < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0]); ++j)
> +			for (k = 0; k < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0]); ++k)
> +				for (l = 0; l < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0]); ++l)
> +					INNERMOST_LOOP;
> +#undef INNERMOST_LOOP
> +}
> +
> +static int rkvdec_vp9_start(struct rkvdec_ctx *ctx)
> +{
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +	struct rkvdec_vp9_priv_tbl *priv_tbl;
> +	struct rkvdec_vp9_ctx *vp9_ctx;
> +	unsigned char *count_tbl;
> +	int ret;
> +
> +	vp9_ctx = kzalloc(sizeof(*vp9_ctx), GFP_KERNEL);
> +	if (!vp9_ctx)
> +		return -ENOMEM;
> +
> +	ctx->priv = vp9_ctx;
> +
> +	priv_tbl = dma_alloc_coherent(rkvdec->dev, sizeof(*priv_tbl),
> +				      &vp9_ctx->priv_tbl.dma, GFP_KERNEL);
> +	if (!priv_tbl) {
> +		ret = -ENOMEM;
> +		goto err_free_ctx;
> +	}
> +
> +	vp9_ctx->priv_tbl.size = sizeof(*priv_tbl);
> +	vp9_ctx->priv_tbl.cpu = priv_tbl;
> +	memset(priv_tbl, 0, sizeof(*priv_tbl));
> +
> +	count_tbl = dma_alloc_coherent(rkvdec->dev, RKVDEC_VP9_COUNT_SIZE,
> +				       &vp9_ctx->count_tbl.dma, GFP_KERNEL);
> +	if (!count_tbl) {
> +		ret = -ENOMEM;
> +		goto err_free_priv_tbl;
> +	}
> +
> +	vp9_ctx->count_tbl.size = RKVDEC_VP9_COUNT_SIZE;
> +	vp9_ctx->count_tbl.cpu = count_tbl;
> +	memset(count_tbl, 0, sizeof(*count_tbl));
> +	rkvdec_init_v4l2_vp9_count_tbl(ctx);
> +
> +	return 0;
> +
> +err_free_priv_tbl:
> +	dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size,
> +			  vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma);
> +
> +err_free_ctx:
> +	kfree(vp9_ctx);
> +	return ret;
> +}
> +
> +static void rkvdec_vp9_stop(struct rkvdec_ctx *ctx)
> +{
> +	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
> +	struct rkvdec_dev *rkvdec = ctx->dev;
> +
> +	dma_free_coherent(rkvdec->dev, vp9_ctx->count_tbl.size,
> +			  vp9_ctx->count_tbl.cpu, vp9_ctx->count_tbl.dma);
> +	dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size,
> +			  vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma);
> +	kfree(vp9_ctx);
> +}
> +
> +static int rkvdec_vp9_adjust_fmt(struct rkvdec_ctx *ctx,
> +				 struct v4l2_format *f)
> +{
> +	struct v4l2_pix_format_mplane *fmt = &f->fmt.pix_mp;
> +
> +	fmt->num_planes = 1;
> +	if (!fmt->plane_fmt[0].sizeimage)
> +		fmt->plane_fmt[0].sizeimage = fmt->width * fmt->height * 2;
> +	return 0;
> +}
> +
> +const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops = {
> +	.adjust_fmt = rkvdec_vp9_adjust_fmt,
> +	.start = rkvdec_vp9_start,
> +	.stop = rkvdec_vp9_stop,
> +	.run = rkvdec_vp9_run,
> +	.done = rkvdec_vp9_done,
> +};
> diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
> index 7131156c1f2c..6aa8aca66547 100644
> --- a/drivers/staging/media/rkvdec/rkvdec.c
> +++ b/drivers/staging/media/rkvdec/rkvdec.c
> @@ -99,10 +99,30 @@ static const struct rkvdec_ctrls rkvdec_h264_ctrls = {
>   	.num_ctrls = ARRAY_SIZE(rkvdec_h264_ctrl_descs),
>   };
>   
> -static const u32 rkvdec_h264_decoded_fmts[] = {
> +static const u32 rkvdec_h264_vp9_decoded_fmts[] = {
>   	V4L2_PIX_FMT_NV12,

For H.264, the rkvdec HW supports additional formats: V4L2_PIX_FMT_NV15, 
V4L2_PIX_FMT_NV16 and V4L2_PIX_FMT_NV20. Not all of those are upstreamed 
yet and thus not supported by the rkvdec driver - but I think we should 
introduce a separate rkvdec_vp9_decoded_fmts already at this point (to 
avoid unnecessary diff afterwards).
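
A minimal sketch of what the split could look like, written as plain C so it
compiles outside the kernel tree. FOURCC, PIX_FMT_NV12 and ARRAY_SIZE are
local stand-ins for the kernel/V4L2 macros, and the two array names are only
the ones proposed here, not existing driver symbols:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumptions) so the sketch builds without kernel headers. */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define PIX_FMT_NV12 FOURCC('N', 'V', '1', '2')
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* H.264 keeps its own table, which can grow once more formats are upstream. */
static const uint32_t rkvdec_h264_decoded_fmts[] = {
	PIX_FMT_NV12,
};

/* VP9 gets a separate table, even though it is identical for now. */
static const uint32_t rkvdec_vp9_decoded_fmts[] = {
	PIX_FMT_NV12,
};
```

With two tables, adding an H.264-only format later touches only the H.264
array and leaves the VP9 entry untouched.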

>   };
>   
> +static const struct rkvdec_ctrl_desc rkvdec_vp9_ctrl_descs[] = {
> +	{
> +		.cfg.id = V4L2_CID_STATELESS_VP9_FRAME,
> +	},
> +	{
> +		.cfg.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
> +	},
> +	{
> +		.cfg.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
> +		.cfg.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
> +		.cfg.max = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
> +		.cfg.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
> +	},
> +};
> +
> +static const struct rkvdec_ctrls rkvdec_vp9_ctrls = {
> +	.ctrls = rkvdec_vp9_ctrl_descs,
> +	.num_ctrls = ARRAY_SIZE(rkvdec_vp9_ctrl_descs),
> +};
> +
>   static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
>   	{
>   		.fourcc = V4L2_PIX_FMT_H264_SLICE,
> @@ -116,8 +136,23 @@ static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
>   		},
>   		.ctrls = &rkvdec_h264_ctrls,
>   		.ops = &rkvdec_h264_fmt_ops,
> -		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_decoded_fmts),
> -		.decoded_fmts = rkvdec_h264_decoded_fmts,
> +		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts),
> +		.decoded_fmts = rkvdec_h264_vp9_decoded_fmts,
> +	},
> +	{
> +		.fourcc = V4L2_PIX_FMT_VP9_FRAME,
> +		.frmsize = {
> +			.min_width = 64,
> +			.max_width = 4096,
> +			.step_width = 64,
> +			.min_height = 64,
> +			.max_height = 2304,
> +			.step_height = 64,
> +		},
I checked the (available) documentation and couldn't find any hint about 
.step_width and .step_height, but I'm not sure these are correct: with 
the values here, neither a frame size of 3840x2160 nor 1280x720 would be 
possible - but the HW seems to have no problem with those, i.e. decoding 
works fine.
Given that the output format is the same as the (only) currently supported 
H.264 output format (NV12), and that those steps usually exist for alignment 
purposes needed by the HW, I strongly suspect .step_height and .step_width 
are the same as for V4L2_PIX_FMT_H264_SLICE.
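
To make the arithmetic concrete, here is a small sketch in plain C (not
driver code). It assumes the usual reading of a stepwise range, i.e. a
dimension is valid if it lies within [min, max] and equals min plus a whole
number of steps. The struct and helper names are made up for illustration;
only the numbers come from the quoted VP9 entry:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the min/max/step fields of a stepwise frame size. */
struct frmsize_stepwise {
	uint32_t min_width, max_width, step_width;
	uint32_t min_height, max_height, step_height;
};

/* True if value lies in [min, max] and is reachable from min in whole steps. */
static bool dim_on_grid(uint32_t value, uint32_t min, uint32_t max,
			uint32_t step)
{
	if (value < min || value > max)
		return false;
	return step == 0 || (value - min) % step == 0;
}

static bool frmsize_allows(const struct frmsize_stepwise *fs,
			   uint32_t width, uint32_t height)
{
	return dim_on_grid(width, fs->min_width, fs->max_width,
			   fs->step_width) &&
	       dim_on_grid(height, fs->min_height, fs->max_height,
			   fs->step_height);
}

/* Values copied from the quoted V4L2_PIX_FMT_VP9_FRAME entry. */
static const struct frmsize_stepwise vp9_frmsize = {
	.min_width = 64, .max_width = 4096, .step_width = 64,
	.min_height = 64, .max_height = 2304, .step_height = 64,
};
```

Under that reading, 2160 and 720 are not of the form 64 + n * 64, so
frmsize_allows(&vp9_frmsize, 3840, 2160) and
frmsize_allows(&vp9_frmsize, 1280, 720) are both false, while the padded
sizes 3840x2176 and 1280x768 are accepted.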

Regards,
Alex
> +		.ctrls = &rkvdec_vp9_ctrls,
> +		.ops = &rkvdec_vp9_fmt_ops,
> +		.num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts),
> +		.decoded_fmts = rkvdec_h264_vp9_decoded_fmts,
>   	}
>   };
>   
> @@ -319,7 +354,7 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
>   	struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
>   	const struct rkvdec_coded_fmt_desc *desc;
>   	struct v4l2_format *cap_fmt;
> -	struct vb2_queue *peer_vq;
> +	struct vb2_queue *peer_vq, *vq;
>   	int ret;
>   
>   	/*
> @@ -331,6 +366,15 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv,
>   	if (vb2_is_busy(peer_vq))
>   		return -EBUSY;
>   
> +	/*
> +	 * Some codecs like VP9 can contain dynamic resolution changes which
> +	 * are currently not supported by the V4L2 API or driver, so return
> +	 * an error if userspace tries to reconfigure the output format.
> +	 */
> +	vq = v4l2_m2m_get_vq(m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> +	if (vb2_is_busy(vq))
> +		return -EINVAL;
> +
>   	ret = rkvdec_s_fmt(file, priv, f, rkvdec_try_output_fmt);
>   	if (ret)
>   		return ret;
> diff --git a/drivers/staging/media/rkvdec/rkvdec.h b/drivers/staging/media/rkvdec/rkvdec.h
> index 52ac3874c5e5..2f4ea1786b93 100644
> --- a/drivers/staging/media/rkvdec/rkvdec.h
> +++ b/drivers/staging/media/rkvdec/rkvdec.h
> @@ -42,14 +42,18 @@ struct rkvdec_run {
>   
>   struct rkvdec_vp9_decoded_buffer_info {
>   	/* Info needed when the decoded frame serves as a reference frame. */
> -	u16 width;
> -	u16 height;
> -	u32 bit_depth : 4;
> +	unsigned short width;
> +	unsigned short height;
> +	unsigned int bit_depth : 4;
>   };
>   
>   struct rkvdec_decoded_buffer {
>   	/* Must be the first field in this struct. */
>   	struct v4l2_m2m_buffer base;
> +
> +	union {
> +		struct rkvdec_vp9_decoded_buffer_info vp9;
> +	};
>   };
>   
>   static inline struct rkvdec_decoded_buffer *
> @@ -116,4 +120,6 @@ void rkvdec_run_preamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run);
>   void rkvdec_run_postamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run);
>   
>   extern const struct rkvdec_coded_fmt_ops rkvdec_h264_fmt_ops;
> +extern const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops;
> +
>   #endif /* RKVDEC_H_ */
> 



* Re: Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-19 16:38       ` Jernej Škrabec
@ 2021-10-20 11:06         ` Ezequiel Garcia
  2021-10-20 15:04           ` Jernej Škrabec
  0 siblings, 1 reply; 37+ messages in thread
From: Ezequiel Garcia @ 2021-10-20 11:06 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: linux-media, linux-arm-kernel, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...,
	open list:STAGING SUBSYSTEM, Andrzej Pietrasiewicz,
	Benjamin Gaignard, Boris Brezillon, Fabio Estevam,
	Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	Collabora Kernel ML, Ezequiel Garcia

Hi Jernej,

On Tue, 19 Oct 2021 at 13:38, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
>
> Hi Andrzej!
>
> On Friday, 15 October 2021 at 19:19:47 CEST, Andrzej Pietrasiewicz
> wrote:
> > Hi Jernej,
> >
> > On 14.10.2021 at 19:42, Jernej Škrabec wrote:
> > > Hi Andrzej!
> > >
> > > On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
> > > wrote:
> > >> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> > >> Enable the G2 post-processor block, in order to produce regular NV12.
> > >>
> > >> The logic in hantro_postproc.c is leveraged to take care of allocating
> > >> the extra buffers and configure the post-processor, which is
> > >> significantly simpler than the one on the G1.
> > >
> > > Quick summary of discussion on LibreELEC Slack:
> > > When using the NV12 format on the Allwinner H6 variant of G2 (needs some
> > > driver changes), I get frames out of order. If I use the native NV12 tiled
> > > format, frames are ordered correctly.
> > >
> > > Currently I'm not sure whether this is an issue with my changes or a
> > > general issue.
> > >
> > > I would be grateful if anyone could test frame order with and without
> > > postprocessing enabled on imx8. Take some dynamic video with a lot of
> > > short scenes. It's pretty obvious when frames are out of order.
> > >
> >
> > I checked on imx8 and cannot observe any such artifacts.
>
> I finally found the issue. As you mentioned on Slack, register write order
> has affected decoding once before. Well, it's the case again. I made a hacky
> test and moved the postproc enable call to after the output buffers are set,
> and it worked. So this is actually a core quirk, which is evidently fixed in
> newer variants.
>

Ugh, good catch.

What happens if you move all the calls to HANTRO_PP_REG_WRITE_S
(HANTRO_PP_REG_WRITE does a relaxed write)?

Or what happens if the HANTRO_PP_REG_WRITE(vpu, out_luma_base, dst_dma)
is moved to be done after all the other registers?

> With minor adaptations, this makes the series work completely on H6. I see
> no reason not to merge the whole series.
>

Do you have plans to submit your H6 work on top of this?

Thanks,
Ezequiel


> Thanks for testing.
>
> Best regards,
> Jernej
>
> >
> > Andrzej
> >
> > > However, given that the frames themselves are decoded correctly and,
> > > without postprocessing, come out in the right order, that shouldn't
> > > block merging the previous patches.
> > > I tried a few different videos and frames were all decoded correctly.
> > >
> > > Best regards,
> > > Jernej
> > >
> > >>
> > >> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> > >> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> > >> ---
> > >>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
> > >>   drivers/staging/media/hantro/hantro_hw.h      |  1 +
> > >>   .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
> > >>   drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
> > >>   4 files changed, 46 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > >> index 7f827b9f0133..1a26be72c878 100644
> > >> --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > >> +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > >> @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
> > >>    hantro_reg_write(ctx->dev, &g2_out_dis, 0);
> > >>    hantro_reg_write(ctx->dev, &g2_output_format, 0);
> > >>
> > >> -  luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
> > >> +  luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
> > >>    hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
> > >>
> > >>    chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> > >> @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
> > >>    hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
> > >>    hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
> > >>
> > >> -  luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
> > >> +  luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
> > >>    hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
> > >>
> > >>    chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> > >> @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
> > >>    config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
> > >>    config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
> > >>
> > >> -  mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
> > >> +  mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
> > >>              mv_offset(ctx, dec_params);
> > >>    hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
> > >>
> > >> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> > >> index 2961d399fd60..3d4a5dc1e6d5 100644
> > >> --- a/drivers/staging/media/hantro/hantro_hw.h
> > >> +++ b/drivers/staging/media/hantro/hantro_hw.h
> > >> @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
> > >>   extern const struct hantro_variant sama5d4_vdec_variant;
> > >>
> > >>   extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
> > >> +extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
> > >>
> > >>   extern const u32 hantro_vp8_dec_mc_filter[8][6];
> > >>
> > >> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> > >> index 4549aec08feb..79a66d001738 100644
> > >> --- a/drivers/staging/media/hantro/hantro_postproc.c
> > >> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> > >> @@ -11,6 +11,7 @@
> > >>   #include "hantro.h"
> > >>   #include "hantro_hw.h"
> > >>   #include "hantro_g1_regs.h"
> > >> +#include "hantro_g2_regs.h"
> > >>
> > >>   #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
> > >>   { \
> > >> @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
> > >>    HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
> > >>   }
> > >>
> > >> +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
> > >> +{
> > >> +  struct hantro_dev *vpu = ctx->dev;
> > >> +  struct vb2_v4l2_buffer *dst_buf;
> > >> +  size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
> > >> +  dma_addr_t dst_dma;
> > >> +
> > >> +  dst_buf = hantro_get_dst_buf(ctx);
> > >> +  dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> > >> +
> > >> +  hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
> > >> +  hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
> > >> +  hantro_reg_write(vpu, &g2_out_rs_e, 1);
> > >> +}
> > >> +
> > >>   void hantro_postproc_free(struct hantro_ctx *ctx)
> > >>   {
> > >>    struct hantro_dev *vpu = ctx->dev;
> > >> @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
> > >>    if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
> > >>            buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
> > >>                                            ctx->dst_fmt.height);
> > >> +  else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
> > >> +          buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
> > >> +                                         ctx->dst_fmt.height);
> > >>
> > >>    for (i = 0; i < num_buffers; ++i) {
> > >>            struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
> > >> @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
> > >>    HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
> > >>   }
> > >>
> > >> +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
> > >> +{
> > >> +  struct hantro_dev *vpu = ctx->dev;
> > >> +
> > >> +  hantro_reg_write(vpu, &g2_out_rs_e, 0);
> > >> +}
> > >> +
> > >>   void hantro_postproc_disable(struct hantro_ctx *ctx)
> > >>   {
> > >>    struct hantro_dev *vpu = ctx->dev;
> > >> @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
> > >>    .enable = hantro_postproc_g1_enable,
> > >>    .disable = hantro_postproc_g1_disable,
> > >>   };
> > >> +
> > >> +const struct hantro_postproc_ops hantro_g2_postproc_ops = {
> > >> +  .enable = hantro_postproc_g2_enable,
> > >> +  .disable = hantro_postproc_g2_disable,
> > >> +};
> > >> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > >> index 455a107ffb02..1a43f6fceef9 100644
> > >> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > >> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > >> @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
> > >>    },
> > >>   };
> > >>
> > >> +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
> > >> +  {
> > >> +          .fourcc = V4L2_PIX_FMT_NV12,
> > >> +          .codec_mode = HANTRO_MODE_NONE,
> > >> +          .postprocessed = true,
> > >> +  },
> > >> +};
> > >> +
> > >>   static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
> > >>    {
> > >>            .fourcc = V4L2_PIX_FMT_NV12_4L4,
> > >> @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
> > >>    .dec_offset = 0x0,
> > >>    .dec_fmts = imx8m_vpu_g2_dec_fmts,
> > >>    .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
> > >> +  .postproc_fmts = imx8m_vpu_g2_postproc_fmts,
> > >> +  .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
> > >> +  .postproc_ops = &hantro_g2_postproc_ops,
> > >>    .codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
> > >>    .codec_ops = imx8mq_vpu_g2_codec_ops,
> > >>    .init = imx8mq_vpu_hw_init,
> > >> --
> > >> 2.17.1
> > >>
> > >>
> > >
> > >
> >
> >
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 07/11] media: rkvdec: Add the VP9 backend
  2021-10-19 23:24   ` Alex Bee
@ 2021-10-20 13:07     ` Andrzej Pietrasiewicz
  0 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-10-20 13:07 UTC (permalink / raw)
  To: Alex Bee, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel, Ezequiel Garcia, Adrian Ratiu

Hi Alex,

On 20.10.2021 at 01:24, Alex Bee wrote:
> Hi Andrzej,
> 
> On 29.09.21 at 18:04, Andrzej Pietrasiewicz wrote:
>> From: Boris Brezillon <boris.brezillon@collabora.com>
>>
>> The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add
>> a backend for this new format.
>>
>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
>> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
>> Co-developed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
>> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
>> ---
>>   drivers/staging/media/rkvdec/Kconfig      |    1 +
>>   drivers/staging/media/rkvdec/Makefile     |    2 +-
>>   drivers/staging/media/rkvdec/rkvdec-vp9.c | 1078 +++++++++++++++++++++
>>   drivers/staging/media/rkvdec/rkvdec.c     |   52 +-
>>   drivers/staging/media/rkvdec/rkvdec.h     |   12 +-
>>   5 files changed, 1137 insertions(+), 8 deletions(-)
>>   create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c

<snip>

>> diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
>> index 7131156c1f2c..6aa8aca66547 100644
>> --- a/drivers/staging/media/rkvdec/rkvdec.c
>> +++ b/drivers/staging/media/rkvdec/rkvdec.c
>> @@ -99,10 +99,30 @@ static const struct rkvdec_ctrls rkvdec_h264_ctrls = {
>>       .num_ctrls = ARRAY_SIZE(rkvdec_h264_ctrl_descs),
>>   };
>> -static const u32 rkvdec_h264_decoded_fmts[] = {
>> +static const u32 rkvdec_h264_vp9_decoded_fmts[] = {
>>       V4L2_PIX_FMT_NV12,
> 
> For H.264, the rkvdec HW supports additional formats: V4L2_PIX_FMT_NV15,
> V4L2_PIX_FMT_NV16 and V4L2_PIX_FMT_NV20. Not all of those are upstreamed yet and
> thus they are not supported by the rkvdec driver, but I think we should already
> introduce a separate rkvdec_vp9_decoded_fmts at this point (to avoid an
> unnecessary diff afterwards).

I will do it if I get to re-spinning the series for other reasons.

> 
>>   };
>> +static const struct rkvdec_ctrl_desc rkvdec_vp9_ctrl_descs[] = {
>> +    {
>> +        .cfg.id = V4L2_CID_STATELESS_VP9_FRAME,
>> +    },
>> +    {
>> +        .cfg.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR,
>> +    },
>> +    {
>> +        .cfg.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
>> +        .cfg.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
>> +        .cfg.max = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
>> +        .cfg.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
>> +    },
>> +};
>> +
>> +static const struct rkvdec_ctrls rkvdec_vp9_ctrls = {
>> +    .ctrls = rkvdec_vp9_ctrl_descs,
>> +    .num_ctrls = ARRAY_SIZE(rkvdec_vp9_ctrl_descs),
>> +};
>> +
>>   static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
>>       {
>>           .fourcc = V4L2_PIX_FMT_H264_SLICE,
>> @@ -116,8 +136,23 @@ static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
>>           },
>>           .ctrls = &rkvdec_h264_ctrls,
>>           .ops = &rkvdec_h264_fmt_ops,
>> -        .num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_decoded_fmts),
>> -        .decoded_fmts = rkvdec_h264_decoded_fmts,
>> +        .num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts),
>> +        .decoded_fmts = rkvdec_h264_vp9_decoded_fmts,
>> +    },
>> +    {
>> +        .fourcc = V4L2_PIX_FMT_VP9_FRAME,
>> +        .frmsize = {
>> +            .min_width = 64,
>> +            .max_width = 4096,
>> +            .step_width = 64,
>> +            .min_height = 64,
>> +            .max_height = 2304,
>> +            .step_height = 64,
>> +        },
> I checked the (available) documentation and couldn't find any hint about
> .step_width and .step_height, but I'm not sure these values are correct: taking
> them at face value, neither a frame size of 3840x2160 nor 1280x720 would be
> possible, yet the HW seems to have no problem with those, i.e. decoding works
> fine. Given that the output format is the same as the (only) currently supported
> H.264 output format (NV12), and that those steps usually exist for alignment
> purposes needed by the HW, I strongly suspect .step_height and .step_width are
> the same as for V4L2_PIX_FMT_H264_SLICE.
> 

Aren't these used primarily by v4l2_apply_frmsize_constraints()? Doesn't
this merely mean that even though userspace requests, say, 48x48,
it will get 64x64 instead?

I tried decoding a 720p video with gstreamer and it worked fine
(I got a properly sized 1280x720 output).
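
To make the arithmetic behind the step values concrete, here is a minimal
sketch in plain userspace C (not kernel code). It checks whether a dimension
is a multiple of a step and rounds a requested dimension up to the next
multiple. The helper names are made up for illustration, and rounding up is an
assumption of the sketch; how v4l2_apply_frmsize_constraints() actually
adjusts a size is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if dim is a whole multiple of step. */
static bool dim_is_step_aligned(unsigned int dim, unsigned int step)
{
	assert(step > 0);
	return dim % step == 0;
}

/*
 * Hypothetical helper: clamp dim to [min, max], then round it up to the
 * next multiple of step. Rounding up is an assumption of this sketch.
 */
static unsigned int dim_round_up_to_step(unsigned int dim, unsigned int min,
					 unsigned int max, unsigned int step)
{
	unsigned int rounded;

	assert(step > 0 && min <= max);
	if (dim < min)
		dim = min;
	if (dim > max)
		dim = max;
	rounded = (dim + step - 1) / step * step;
	/* Never exceed the largest multiple of step that fits in max. */
	if (rounded > max)
		rounded = max / step * step;
	return rounded;
}
```

With a step of 64, the widths 1280 and 3840 are already multiples of the step,
while the heights 720 and 2160 are not; under this simplified model a request
for 48 would become 64, and 720 and 2160 would be padded to 768 and 2176.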

Regards,

Andrzej

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Re: Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-20 11:06         ` Ezequiel Garcia
@ 2021-10-20 15:04           ` Jernej Škrabec
  2021-10-20 15:25             ` Ezequiel Garcia
  0 siblings, 1 reply; 37+ messages in thread
From: Jernej Škrabec @ 2021-10-20 15:04 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: linux-media, linux-arm-kernel, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...,
	open list:STAGING SUBSYSTEM, Andrzej Pietrasiewicz,
	Benjamin Gaignard, Boris Brezillon, Fabio Estevam,
	Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	Collabora Kernel ML, Ezequiel Garcia

On Wednesday, 20 October 2021 at 13:06:59 CEST, Ezequiel Garcia wrote:
> Hi Jernej,
>
> On Tue, 19 Oct 2021 at 13:38, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
> >
> > Hi Andrzej!
> >
> > On Friday, 15 October 2021 at 19:19:47 CEST, Andrzej Pietrasiewicz
> > wrote:
> > > Hi Jernej,
> > >
> > > On 14.10.2021 at 19:42, Jernej Škrabec wrote:
> > > > Hi Andrzej!
> > > >
> > > > On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
> > > > wrote:
> > > >> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> > > >> Enable the G2 post-processor block, in order to produce regular NV12.
> > > >>
> > > >> The logic in hantro_postproc.c is leveraged to take care of allocating
> > > >> the extra buffers and configure the post-processor, which is
> > > >> significantly simpler than the one on the G1.
> > > >
> > > > Quick summary of the discussion on LibreELEC Slack:
> > > > When using the NV12 format on the Allwinner H6 variant of G2 (needs
> > > > some driver changes), I get frames out of order. If I use the native
> > > > NV12 tiled format, frames are ordered correctly.
> > > >
> > > > Currently I'm not sure whether this is an issue with my changes or a
> > > > general issue.
> > > >
> > > > I would be grateful if anyone could test frame order with and without
> > > > postprocessing enabled on imx8. Take some dynamic video with a lot of
> > > > short scenes. It's pretty obvious when frames are out of order.
> > > >
> > >
> > > I checked on imx8 and cannot observe any such artifacts.
> >
> > I finally found the issue. As you mentioned on Slack, register write order
> > has affected decoding once before. Well, it's the case again. I made a hacky
> > test and moved the postproc enable call to after the output buffers are set,
> > and it worked. So this is actually a core quirk, which is evidently fixed in
> > newer variants.
> >
> 
> Ugh, good catch.
> 
> What happens if you move all the calls to HANTRO_PP_REG_WRITE_S
> (HANTRO_PP_REG_WRITE does a relaxed write)?
> 
> Or what happens if the HANTRO_PP_REG_WRITE(vpu, out_luma_base, dst_dma)
> is moved to be done after all the other registers?

Those two macros aren't used on G2. Andrzej introduced new postproc helpers
for G2.

This commit solves the issue for H6:
https://github.com/jernejsk/linux-1/commit/a783a977c0843bb4b555dc9d0b5d64915cd219e7
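
As a rough illustration of why the order can matter (this is not the content
of the commit above), the sketch below models a block that samples its output
addresses at the moment the enable bit is written. Everything in it is an
assumption made for illustration: the struct, the helper names, and the
latching behaviour itself. The real dependency on H6 may be of a different
nature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a post-processor, not real Hantro registers. */
struct fake_pp {
	uint32_t luma_addr;      /* programmed output luma address */
	uint32_t chroma_addr;    /* programmed output chroma address */
	uint32_t latched_luma;   /* address the block will actually use */
	uint32_t latched_chroma;
	int enabled;
};

static void fake_pp_write_addrs(struct fake_pp *pp, uint32_t luma,
				uint32_t chroma_offset)
{
	assert(pp);
	pp->luma_addr = luma;
	pp->chroma_addr = luma + chroma_offset;
}

/* Assumed quirk: addresses are sampled when the enable bit is set. */
static void fake_pp_enable(struct fake_pp *pp)
{
	assert(pp);
	pp->latched_luma = pp->luma_addr;
	pp->latched_chroma = pp->chroma_addr;
	pp->enabled = 1;
}

/* Enable first, then addresses: the block keeps the stale addresses. */
static void program_enable_first(struct fake_pp *pp, uint32_t luma,
				 uint32_t off)
{
	fake_pp_enable(pp);
	fake_pp_write_addrs(pp, luma, off);
}

/* Addresses first, then enable: the block sees the new buffer. */
static void program_enable_last(struct fake_pp *pp, uint32_t luma,
				uint32_t off)
{
	fake_pp_write_addrs(pp, luma, off);
	fake_pp_enable(pp);
}
```

In this model, programming two frames in a row with the enable-first variant
leaves the second frame pointing at the first frame's buffer, which would look
like frames arriving out of order; the enable-last variant always uses the
buffer that was just programmed.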

> 
> > With minor adaptations, this makes the series work completely on H6. I see
> > no reason not to merge the whole series.
> >
> 
> Do you have plans to submit your H6 work on top of this?

Of course, why would I work on this otherwise? :) But before I do that, I have
to clean up and split one commit, which adapts the VP9 G2 code for the H6 variant.

If you're interested in changes, take a look here:
https://github.com/jernejsk/linux-1/commits/vp9

Best regards,
Jernej

> 
> Thanks,
> Ezequiel
> 
> 
> > Thanks for testing.
> >
> > Best regards,
> > Jernej
> >
> > >
> > > Andrzej
> > >
> > > > However, given that the frames themselves are decoded correctly and,
> > > > without postprocessing, come out in the right order, that shouldn't
> > > > block merging the previous patches.
> > > > I tried a few different videos and frames were all decoded correctly.
> > > >
> > > > Best regards,
> > > > Jernej
> > > >
> > > >>
> > > >> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> > > >> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
> > > >> ---
> > > >>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  6 ++--
> > > >>   drivers/staging/media/hantro/hantro_hw.h      |  1 +
> > > >>   .../staging/media/hantro/hantro_postproc.c    | 31 +++++++++++++++++++
> > > >>   drivers/staging/media/hantro/imx8m_vpu_hw.c   | 11 +++++++
> > > >>   4 files changed, 46 insertions(+), 3 deletions(-)
> > > >>
> > > >> diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > > >> index 7f827b9f0133..1a26be72c878 100644
> > > >> --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > > >> +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c
> > > >> @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx,
> > > >>    hantro_reg_write(ctx->dev, &g2_out_dis, 0);
> > > >>    hantro_reg_write(ctx->dev, &g2_output_format, 0);
> > > >>
> > > >> -  luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0);
> > > >> +  luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
> > > >>    hantro_write_addr(ctx->dev, G2_OUT_LUMA_ADDR, luma_addr);
> > > >>
> > > >>    chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> > > >> @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx,
> > > >>    hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width);
> > > >>    hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height);
> > > >>
> > > >> -  luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0);
> > > >> +  luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf);
> > > >>    hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr);
> > > >>
> > > >>    chroma_addr = luma_addr + chroma_offset(ctx, dec_params);
> > > >> @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx,
> > > >>    config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
> > > >>    config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
> > > >>
> > > >> -  mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) +
> > > >> +  mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
> > > >>              mv_offset(ctx, dec_params);
> > > >>    hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
> > > >>
> > > >> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> > > >> index 2961d399fd60..3d4a5dc1e6d5 100644
> > > >> --- a/drivers/staging/media/hantro/hantro_hw.h
> > > >> +++ b/drivers/staging/media/hantro/hantro_hw.h
> > > >> @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
> > > >>   extern const struct hantro_variant sama5d4_vdec_variant;
> > > >>
> > > >>   extern const struct hantro_postproc_ops hantro_g1_postproc_ops;
> > > >> +extern const struct hantro_postproc_ops hantro_g2_postproc_ops;
> > > >>
> > > >>   extern const u32 hantro_vp8_dec_mc_filter[8][6];
> > > >>
> > > >> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> > > >> index 4549aec08feb..79a66d001738 100644
> > > >> --- a/drivers/staging/media/hantro/hantro_postproc.c
> > > >> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> > > >> @@ -11,6 +11,7 @@
> > > >>   #include "hantro.h"
> > > >>   #include "hantro_hw.h"
> > > >>   #include "hantro_g1_regs.h"
> > > >> +#include "hantro_g2_regs.h"
> > > >>
> > > >>   #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
> > > >>   { \
> > > >> @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx)
> > > >>    HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width);
> > > >>   }
> > > >>
> > > >> +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx)
> > > >> +{
> > > >> +  struct hantro_dev *vpu = ctx->dev;
> > > >> +  struct vb2_v4l2_buffer *dst_buf;
> > > >> +  size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height;
> > > >> +  dma_addr_t dst_dma;
> > > >> +
> > > >> +  dst_buf = hantro_get_dst_buf(ctx);
> > > >> +  dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> > > >> +
> > > >> +  hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma);
> > > >> +  hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + chroma_offset);
> > > >> +  hantro_reg_write(vpu, &g2_out_rs_e, 1);
> > > >> +}
> > > >> +
> > > >>   void hantro_postproc_free(struct hantro_ctx *ctx)
> > > >>   {
> > > >>    struct hantro_dev *vpu = ctx->dev;
> > > >> @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx)
> > > >>    if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
> > > >>            buf_size += hantro_h264_mv_size(ctx->dst_fmt.width,
> > > >>                                            ctx->dst_fmt.height);
> > > >> +  else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
> > > >> +          buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width,
> > > >> +                                         ctx->dst_fmt.height);
> > > >>
> > > >>    for (i = 0; i < num_buffers; ++i) {
> > > >>            struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i];
> > > >> @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx)
> > > >>    HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0);
> > > >>   }
> > > >>
> > > >> +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx)
> > > >> +{
> > > >> +  struct hantro_dev *vpu = ctx->dev;
> > > >> +
> > > >> +  hantro_reg_write(vpu, &g2_out_rs_e, 0);
> > > >> +}
> > > >> +
> > > >>   void hantro_postproc_disable(struct hantro_ctx *ctx)
> > > >>   {
> > > >>    struct hantro_dev *vpu = ctx->dev;
> > > >> @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = {
> > > >>    .enable = hantro_postproc_g1_enable,
> > > >>    .disable = hantro_postproc_g1_disable,
> > > >>   };
> > > >> +
> > > >> +const struct hantro_postproc_ops hantro_g2_postproc_ops = {
> > > >> +  .enable = hantro_postproc_g2_enable,
> > > >> +  .disable = hantro_postproc_g2_disable,
> > > >> +};
> > > >> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > > >> index 455a107ffb02..1a43f6fceef9 100644
> > > >> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > > >> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> > > >> @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
> > > >>    },
> > > >>   };
> > > >>
> > > >> +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
> > > >> +  {
> > > >> +          .fourcc = V4L2_PIX_FMT_NV12,
> > > >> +          .codec_mode = HANTRO_MODE_NONE,
> > > >> +          .postprocessed = true,
> > > >> +  },
> > > >> +};
> > > >> +
> > > >>   static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
> > > >>    {
> > > >>            .fourcc = V4L2_PIX_FMT_NV12_4L4,
> > > >> @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = {
> > > >>    .dec_offset = 0x0,
> > > >>    .dec_fmts = imx8m_vpu_g2_dec_fmts,
> > > >>    .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
> > > >> +  .postproc_fmts = imx8m_vpu_g2_postproc_fmts,
> > > >> +  .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts),
> > > >> +  .postproc_ops = &hantro_g2_postproc_ops,
> > > >>    .codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER,
> > > >>    .codec_ops = imx8mq_vpu_g2_codec_ops,
> > > >>    .init = imx8mq_vpu_hw_init,
> > > >> --
> > > >> 2.17.1
> > > >>
> > > >>
> > > >
> > > >
> > >
> > >
> >
> >
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Re: Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-20 15:04           ` Jernej Škrabec
@ 2021-10-20 15:25             ` Ezequiel Garcia
  2021-10-21 15:36               ` Jernej Škrabec
  0 siblings, 1 reply; 37+ messages in thread
From: Ezequiel Garcia @ 2021-10-20 15:25 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: linux-media, linux-arm-kernel, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...,
	open list:STAGING SUBSYSTEM, Andrzej Pietrasiewicz,
	Benjamin Gaignard, Boris Brezillon, Fabio Estevam,
	Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	Collabora Kernel ML, Ezequiel Garcia

On Wed, 20 Oct 2021 at 12:04, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
>
> On Wednesday, 20 October 2021 at 13:06:59 CEST, Ezequiel Garcia wrote:
> > Hi Jernej,
> >
> > On Tue, 19 Oct 2021 at 13:38, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
> > >
> > > Hi Andrzej!
> > >
> > > On Friday, 15 October 2021 at 19:19:47 CEST, Andrzej Pietrasiewicz
> > > wrote:
> > > > Hi Jernej,
> > > >
> > > > On 14.10.2021 at 19:42, Jernej Škrabec wrote:
> > > > > Hi Andrzej!
> > > > >
> > > > > On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
> > > > > wrote:
> > > > >> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> > > > >> Enable the G2 post-processor block, in order to produce regular NV12.
> > > > >>
> > > > >> The logic in hantro_postproc.c is leveraged to take care of allocating
> > > > >> the extra buffers and configure the post-processor, which is
> > > > >> significantly simpler than the one on the G1.
> > > > >
> > > > > Quick summary of the discussion on LibreELEC Slack:
> > > > > When using the NV12 format on the Allwinner H6 variant of G2 (needs
> > > > > some driver changes), I get frames out of order. If I use the native
> > > > > NV12 tiled format, frames are ordered correctly.
> > > > >
> > > > > Currently I'm not sure whether this is an issue with my changes or a
> > > > > general issue.
> > > > >
> > > > > I would be grateful if anyone could test frame order with and without
> > > > > postprocessing enabled on imx8. Take some dynamic video with a lot of
> > > > > short scenes. It's pretty obvious when frames are out of order.
> > > > >
> > > >
> > > > I checked on imx8 and cannot observe any such artifacts.
> > >
> > > I finally found the issue. As you mentioned on Slack, register write order
> > > has affected decoding once before. Well, it's the case again. I made a hacky
> > > test and moved the postproc enable call to after the output buffers are set,
> > > and it worked. So this is actually a core quirk, which is evidently fixed in
> > > newer variants.
> > >
> >
> > Ugh, good catch.
> >
> > What happens if you move all the calls to HANTRO_PP_REG_WRITE_S
> > (HANTRO_PP_REG_WRITE does a relaxed write)?
> >
> > Or what happens if the HANTRO_PP_REG_WRITE(vpu, out_luma_base, dst_dma)
> > is moved to be done after all the other registers?
>
> Those two macros aren't used on G2. Andrzej introduced new postproc helpers
> for G2.
>

Ah, so the issue is specific to the G2 post-processor.

> This commit solves the issue for H6:
> https://github.com/jernejsk/linux-1/commit/a783a977c0843bb4b555dc9d0b5d64915cd219e7
>

Right, but see this comment:

    /* Turn on pipeline mode. Must be done first. */
    HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x1);

I have only a vague recollection of why we have that comment,
but I'm reluctant to move the post-proc enable to the end
(or at least, can we avoid doing it on G1?).

> >
> > > With minor adaptations, this makes the series work completely on H6. I see
> > > no reason not to merge the whole series.
> > >
> >
> > Do you have plans to submit your H6 work on top of this?
>
> Of course, why would I work on this otherwise? :) But before I do that, I have
> to clean up and split one commit, which adapts the VP9 G2 code for the H6 variant.
>

OK, sounds good.

> If you're interested in changes, take a look here:
> https://github.com/jernejsk/linux-1/commits/vp9
>

Will take a look.

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Re: Re: Re: [PATCH v7 11/11] media: hantro: Support NV12 on the G2 core
  2021-10-20 15:25             ` Ezequiel Garcia
@ 2021-10-21 15:36               ` Jernej Škrabec
  0 siblings, 0 replies; 37+ messages in thread
From: Jernej Škrabec @ 2021-10-21 15:36 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: linux-media, linux-arm-kernel, Linux Kernel Mailing List,
	open list:ARM/Rockchip SoC...,
	open list:STAGING SUBSYSTEM, Andrzej Pietrasiewicz,
	Benjamin Gaignard, Boris Brezillon, Fabio Estevam,
	Greg Kroah-Hartman, Hans Verkuil, Heiko Stuebner,
	Mauro Carvalho Chehab, Nicolas Dufresne, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	Collabora Kernel ML, Ezequiel Garcia

On Wednesday, 20 October 2021 at 17:25:40 CEST, Ezequiel Garcia wrote:
> On Wed, 20 Oct 2021 at 12:04, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
> >
> > On Wednesday, 20 October 2021 at 13:06:59 CEST, Ezequiel Garcia wrote:
> > > Hi Jernej,
> > >
> > > On Tue, 19 Oct 2021 at 13:38, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
> > > >
> > > > Hi Andrzej!
> > > >
> > > > On Friday, 15 October 2021 at 19:19:47 CEST, Andrzej Pietrasiewicz
> > > > wrote:
> > > > > Hi Jernej,
> > > > >
> > > > > On 14.10.2021 at 19:42, Jernej Škrabec wrote:
> > > > > > Hi Andrzej!
> > > > > >
> > > > > > On Wednesday, 29 September 2021 at 18:04:39 CEST, Andrzej Pietrasiewicz
> > > > > > wrote:
> > > > > >> The G2 decoder block produces NV12 4x4 tiled format (NV12_4L4).
> > > > > >> Enable the G2 post-processor block, in order to produce regular 
NV12.
> > > > > >>
> > > > > >> The logic in hantro_postproc.c is leveraged to take care of
> > allocating
> > > > > >> the extra buffers and configure the post-processor, which is
> > > > > >> significantly simpler than the one on the G1.
> > > > > >
> > > > > > Quick summary of the discussion on LibreELEC Slack:
> > > > > > When using the NV12 format on the Allwinner H6 variant of G2 (which
> > > > > > needs some driver changes), I get frames out of order. If I use the
> > > > > > native NV12 tiled format, frames are ordered correctly.
> > > > > >
> > > > > > Currently I'm not sure whether this is an issue with my changes or a
> > > > > > general issue.
> > > > > >
> > > > > > I would be grateful if anyone could test frame order with and without
> > > > > > postprocessing enabled on imx8. Take some dynamic video with a lot of
> > > > > > short scenes. It's pretty obvious when frames are out of order.
> > > > > >
> > > > >
> > > > > I checked on imx8 and cannot observe any such artifacts.
> > > >
> > > > I finally found the issue. As you mentioned on Slack, register write order
> > > > has already affected decoding once. Well, that's the case again. I made a
> > > > hacky test, moving the postproc enable call after the output buffers are
> > > > set, and it worked. So this is actually a core quirk, which is evidently
> > > > fixed in newer variants.
> > > >
> > >
> > > Ugh, good catch.
> > >
> > > What happens if you switch all the calls to HANTRO_PP_REG_WRITE_S
> > > (HANTRO_PP_REG_WRITE does a relaxed write)?
> > >
> > > Or what happens if the HANTRO_PP_REG_WRITE(vpu, out_luma_base, dst_dma)
> > > is moved to be done after all the other registers?
> >
> > Those two macros aren't used on G2. Andrzej introduced new postproc helpers
> > for G2.
> >
> 
> Ah, so the issue is specific to the G2 post-processor.

To be more precise, the issue is specific to the old G2 post-processor found in
the Allwinner H6. Andrzej tested the code with a newer G2 core and both locations
worked fine.

> 
> > This commit solves the issue for H6:
> > https://github.com/jernejsk/linux-1/commit/a783a977c0843bb4b555dc9d0b5d64915cd219e7
> >
> 
> Right, but see this comment:
> 
>     /* Turn on pipeline mode. Must be done first. */
>     HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x1);
> 
> I have a vague recollection of why we have that comment,
> and I'm reluctant to move the post-proc enable to the end
> (or at least, let's not do it on G1?).

I missed that. Any idea what would be the cleanest way to move the code for G2
only? I can only think of a quirk flag in the platform-specific structure.
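
For illustration only, a quirk flag of that kind could look roughly like the
sketch below. This is not the hantro driver code: struct pp_variant, the
pp_enable_last flag, the trace helper and the register names passed as strings
are all hypothetical stand-ins, and the register writes are mocked so that the
resulting write order can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant description; NOT the real struct hantro_variant. */
struct pp_variant {
	const char *name;
	bool pp_enable_last; /* hypothetical quirk: enable postproc after buffers */
};

/* Records the order of mock register writes so it can be inspected. */
#define PP_TRACE_MAX 8
struct pp_trace {
	const char *writes[PP_TRACE_MAX];
	size_t count;
};

/* Mock register write: only remembers which register was touched. */
static void pp_write(struct pp_trace *trace, const char *reg)
{
	assert(trace->count < PP_TRACE_MAX);
	trace->writes[trace->count++] = reg;
}

/* Program the mock post-processor, honouring the variant's quirk flag. */
static void pp_enable(const struct pp_variant *v, struct pp_trace *trace)
{
	trace->count = 0;

	/* Default order: enable first, as on variants without the quirk. */
	if (!v->pp_enable_last)
		pp_write(trace, "pp_enable");

	/* Placeholder names for the output buffer address registers. */
	pp_write(trace, "out_luma_base");
	pp_write(trace, "out_chroma_base");

	/* Quirky variants: enable only once the output buffers are set. */
	if (v->pp_enable_last)
		pp_write(trace, "pp_enable");
}

/* Returns true if `reg` was the first write recorded in `trace`. */
static bool pp_first_write_is(const struct pp_trace *trace, const char *reg)
{
	return trace->count > 0 && strcmp(trace->writes[0], reg) == 0;
}
```

A variant such as the H6 would set pp_enable_last, so the enable write lands
after the buffer addresses; variants that leave it unset keep the existing
"enable first" order, so G1 would not be affected.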

Best regards,
Jernej

> 
> > >
> > > > This makes this series, with minor adaptations, work completely on H6. I
> > > > see no reason not to merge the whole series.
> > > >
> > >
> > > Do you have plans to submit your H6 work on top of this?
> >
> > Of course, why would I work on this otherwise? :) But before I do that, I have
> > to clean up and split one commit, which adapts the VP9 G2 code for the H6 variant.
> >
> 
> OK, sounds good.
> 
> > If you're interested in the changes, take a look here:
> > https://github.com/jernejsk/linux-1/commits/vp9
> >
> 
> Will take a look.
> 
> Thanks,
> Ezequiel
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (11 preceding siblings ...)
  2021-10-19 17:55 ` [PATCH v7 00/11] VP9 codec V4L2 control interface Ezequiel Garcia
@ 2021-11-11 14:44 ` Hans Verkuil
  2021-11-12 15:27   ` Nicolas Dufresne
  2021-11-15 15:07 ` Hans Verkuil
  13 siblings, 1 reply; 37+ messages in thread
From: Hans Verkuil @ 2021-11-11 14:44 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi all,

Andrzej, Jernej, Nicolas, if none of you (or anyone else for that matter)
objects, then I'll make a PR for this early next week.

Regards,

	Hans

On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
> Dear all,
> 
> This patch series adds VP9 codec V4L2 control interface and two drivers
> using the new controls. It is a follow-up of previous v6 series [1].
> 
> In this iteration, we've implemented VP9 hardware decoding on two devices:
> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
> The i.MX8M driver needs proper power domains support, though, which is a
> subject of a different effort, but in all 3 cases we were able to run the
> drivers.
> 
> GStreamer support is also available, the needed changes have been submitted
> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
> VP9 V4L2 controls to be merged and released.
> 
> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
> using Fluster[3]. There are still a few tests that are not passing, due to
> dynamic frame resize (not yet supported by V4L2) and small size videos
> (due to IP block limitations).
> 
> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
> merged without passing through staging, as agreed[4]. The ABI has been checked
> for padding and verified to contain no holes.
> 
> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
> [3] https://github.com/fluendo/fluster
> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
> 
> The series depends on the YUV tiled format support prepared by Ezequiel:
> https://www.spinics.net/lists/linux-media/msg197047.html
> 
> Rebased onto latest media_tree.
> 
> Changes related to v6:
> - moved setting tile filter and tile bsd auxiliary buffer addresses so
> that they are always set, even if no tiles are used (thanks, Jernej)
> - added a comment near the place where the 32-bit DMA mask is applied
>   (thanks, Nicolas)
> - improved consistency in register names (thanks, Nicolas)
> 
> Changes related to v5:
> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
> - improved pdf output of documentation
> - added Benjamin's Reviewed-by (thanks, Benjamin)
> 
> Changes related to v4:
> - removed unused enum v4l2_vp9_intra_prediction_mode
> - converted remaining enums to defines to follow the convention
> - improved the documentation, in particular better documented how to use
>   segmentation features
> 
> Changes related to v3:
> 
> Apply suggestions from Jernej's review (thanks, Jernej):
> - renamed a control and two structs:
> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
> 		v4l2_ctrl_vp9_compressed_hdr
> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
> - explicitly assigned values to all other vp9 enums
> 
> Apply suggestion from Nicolas's review (thanks, Nicolas):
> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
> and implemented only by drivers which need it
> 
> Changes related to the RFC v2:
> 
> - added another driver including a postprocessor to de-tile
>         codec-specific tiling
> - reworked uAPI structs layout to follow VP8 style
> - changed validation of loop filter params
> - changed validation of segmentation params
> - changed validation of VP9 frame params
> - removed level lookup array from loop filter struct
>         (can be computed by drivers)
> - renamed some enum values to match the spec more closely
> - V4L2 VP9 library changed the 'eob' member of
>         'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>         of pointers instead of an array of pointers to arrays
>         (IPs such as g2 creatively pass parts of the 'eob' counts in
>         the 'coeff' counts)
> - factored out several repeated portions of code
> - minor nitpicks and cleanups
> 
> Andrzej Pietrasiewicz (6):
>   media: uapi: Add VP9 stateless decoder controls
>   media: Add VP9 v4l2 library
>   media: hantro: Rename registers
>   media: hantro: Prepare for other G2 codecs
>   media: hantro: Support VP9 on the G2 core
>   media: hantro: Support NV12 on the G2 core
> 
> Boris Brezillon (1):
>   media: rkvdec: Add the VP9 backend
> 
> Ezequiel Garcia (4):
>   hantro: postproc: Fix motion vector space size
>   hantro: postproc: Introduce struct hantro_postproc_ops
>   hantro: Simplify postprocessor
>   hantro: Add quirk for NV12/NV12_4L4 capture format
> 
>  .../userspace-api/media/v4l/biblio.rst        |   10 +
>  .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>  .../media/v4l/pixfmt-compressed.rst           |   15 +
>  .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>  .../media/v4l/vidioc-queryctrl.rst            |   12 +
>  .../media/videodev2.h.rst.exceptions          |    2 +
>  drivers/media/v4l2-core/Kconfig               |    4 +
>  drivers/media/v4l2-core/Makefile              |    1 +
>  drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>  drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>  drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>  drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>  drivers/staging/media/hantro/Kconfig          |    1 +
>  drivers/staging/media/hantro/Makefile         |    7 +-
>  drivers/staging/media/hantro/hantro.h         |   40 +-
>  drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>  drivers/staging/media/hantro/hantro_g2.c      |   27 +
>  .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>  drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>  .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>  .../staging/media/hantro/hantro_postproc.c    |   79 +-
>  drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>  drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>  drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>  drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>  .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>  .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>  drivers/staging/media/rkvdec/Kconfig          |    1 +
>  drivers/staging/media/rkvdec/Makefile         |    2 +-
>  drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>  drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>  drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>  include/media/v4l2-ctrls.h                    |    4 +
>  include/media/v4l2-vp9.h                      |  182 ++
>  include/uapi/linux/v4l2-controls.h            |  284 +++
>  include/uapi/linux/videodev2.h                |    6 +
>  37 files changed, 6033 insertions(+), 104 deletions(-)
>  create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>  create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>  create mode 100644 include/media/v4l2-vp9.h
> 
> 
> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-11 14:44 ` Hans Verkuil
@ 2021-11-12 15:27   ` Nicolas Dufresne
  2021-11-15 12:56     ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Nicolas Dufresne @ 2021-11-12 15:27 UTC (permalink / raw)
  To: Hans Verkuil, Andrzej Pietrasiewicz, linux-media,
	linux-arm-kernel, linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	kernel

Hi Hans,

On Thursday, 11 November 2021 at 15:44 +0100, Hans Verkuil wrote:
> Hi all,
> 
> Andrzej, Jernej, Nicolas, if none of you (or anyone else for that matter)
> objects, then I'll make a PR for this early next week.

I have no objection. I've myself delayed replying as we have been digging a lot
into our compliance failures, but I believe we have explained most of them by
now and nothing seems to be related to the API.

regards,
Nicolas

> 
> Regards,
> 
> 	Hans
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-12 15:27   ` Nicolas Dufresne
@ 2021-11-15 12:56     ` Andrzej Pietrasiewicz
  2021-11-15 13:09       ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-15 12:56 UTC (permalink / raw)
  To: Nicolas Dufresne, Hans Verkuil, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	kernel

Hi Hans,

On 12.11.2021 at 16:27, Nicolas Dufresne wrote:
> Hi Hans,
> 
> On Thursday, 11 November 2021 at 15:44 +0100, Hans Verkuil wrote:
>> Hi all,
>>
>> Andrzej, Jernej, Nicolas, if none of you (or anyone else for that matter)
>> objects, then I'll make a PR for this early next week.
> 
> I have no objection. I've myself delayed replying as we have been digging a lot
> into our compliance failures, but I believe we have explained most of them by
> now and nothing seems to be related to the API.
> 
> regards,
> Nicolas

I'm fine with making a PR, too.

Andrzej

> 
>>
>> Regards,
>>
>> 	Hans
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-15 12:56     ` Andrzej Pietrasiewicz
@ 2021-11-15 13:09       ` Andrzej Pietrasiewicz
  0 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-15 13:09 UTC (permalink / raw)
  To: Nicolas Dufresne, Hans Verkuil, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, NXP Linux Team,
	Pengutronix Kernel Team, Philipp Zabel, Sascha Hauer, Shawn Guo,
	kernel

Hi Hans,

Let me clarify:

On 15.11.2021 at 13:56, Andrzej Pietrasiewicz wrote:
> Hi Hans,
> 
> On 12.11.2021 at 16:27, Nicolas Dufresne wrote:
>> Hi Hans,
>>
>> On Thursday, 11 November 2021 at 15:44 +0100, Hans Verkuil wrote:
>>> Hi all,
>>>
>>> Andrzej, Jernej, Nicolas, if none of you (or anyone else for that matter)
>>> objects, then I'll make a PR for this early next week.
>>
>> I have no objection. I've myself delayed replying as we have been digging a lot
>> into our compliance failures, but I believe we have explained most of them by
>> now and nothing seems to be related to the API.
>>
>> regards,
>> Nicolas
> I'm fine with making a PR, too.

What I meant was this: "I'm fine with you making a PR."


> 
> Andrzej
> 
>>
>>>
>>> Regards,
>>>
>>>     Hans
>>>
>>> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>>>> Dear all,
>>>>
>>>> This patch series adds VP9 codec V4L2 control interface and two drivers
>>>> using the new controls. It is a follow-up of previous v6 series [1].
>>>>
>>>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>>>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>>>> The i.MX8M driver needs proper power domains support, though, which is a
>>>> subject of a different effort, but in all 3 cases we were able to run the
>>>> drivers.
>>>>
>>>> GStreamer support is also available, the needed changes have been submitted
>>>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>>>> VP9 V4L2 controls to be merged and released.
>>>>
>>>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>>>> using Fluster[3]. There are still a few tests that are not passing, due to
>>>> dynamic frame resize (not yet supported by V4L2) and small size videos
>>>> (due to IP block limitations).
>>>>
>>>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>>>> merged without passing through staging, as agreed[4]. The ABI has been checked
>>>> for padding and verified to contain no holes.
>>>>
>>>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>>>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>>>> [3] https://github.com/fluendo/fluster
>>>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>>>
>>>> The series depends on the YUV tiled format support prepared by Ezequiel:
>>>> https://www.spinics.net/lists/linux-media/msg197047.html
>>>>
>>>> Rebased onto latest media_tree.
>>>>
>>>> Changes related to v6:
>>>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>>>> that they are always set, even if no tiles are used (thanks, Jernej)
>>>> - added a comment near the place where the 32-bit DMA mask is applied
>>>>    (thanks, Nicolas)
>>>> - improved consistency in register names (thanks, Nicolas)
>>>>
>>>> Changes related to v5:
>>>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>>>> - improved pdf output of documentation
>>>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>>>
>>>> Changes related to v4:
>>>> - removed unused enum v4l2_vp9_intra_prediction_mode
>>>> - converted remaining enums to defines to follow the convention
>>>> - improved the documentation, in particular better documented how to use
>>>>   segmentation features
>>>>
>>>> Changes related to v3:
>>>>
>>>> Apply suggestions from Jernej's review (thanks, Jernej):
>>>> - renamed a control and two structs:
>>>>     V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>>>>         V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>>>>     v4l2_ctrl_vp9_compressed_hdr_probs =>
>>>>         v4l2_ctrl_vp9_compressed_hdr
>>>>     v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>>>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>>>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a
>>>>   bitfield)
>>>> - explicitly assigned values to all other vp9 enums
>>>>
>>>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>>>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>>>> and implemented only by drivers which need it
>>>>
>>>> Changes related to the RFC v2:
>>>>
>>>> - added another driver including a postprocessor to de-tile
>>>>          codec-specific tiling
>>>> - reworked uAPI structs layout to follow VP8 style
>>>> - changed validation of loop filter params
>>>> - changed validation of segmentation params
>>>> - changed validation of VP9 frame params
>>>> - removed level lookup array from loop filter struct
>>>>          (can be computed by drivers)
>>>> - renamed some enum values to match the spec more closely
>>>> - V4L2 VP9 library changed the 'eob' member of
>>>>          'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>>>          of pointers instead of an array of pointers to arrays
>>>>          (IPs such as g2 creatively pass parts of the 'eob' counts in
>>>>          the 'coeff' counts)
>>>> - factored out several repeated portions of code
>>>> - minor nitpicks and cleanups
>>>>
>>>> Andrzej Pietrasiewicz (6):
>>>>    media: uapi: Add VP9 stateless decoder controls
>>>>    media: Add VP9 v4l2 library
>>>>    media: hantro: Rename registers
>>>>    media: hantro: Prepare for other G2 codecs
>>>>    media: hantro: Support VP9 on the G2 core
>>>>    media: hantro: Support NV12 on the G2 core
>>>>
>>>> Boris Brezillon (1):
>>>>    media: rkvdec: Add the VP9 backend
>>>>
>>>> Ezequiel Garcia (4):
>>>>    hantro: postproc: Fix motion vector space size
>>>>    hantro: postproc: Introduce struct hantro_postproc_ops
>>>>    hantro: Simplify postprocessor
>>>>    hantro: Add quirk for NV12/NV12_4L4 capture format
>>>>
>>>>   .../userspace-api/media/v4l/biblio.rst        |   10 +
>>>>   .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>>>   .../media/v4l/pixfmt-compressed.rst           |   15 +
>>>>   .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>>>   .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>>>   .../media/videodev2.h.rst.exceptions          |    2 +
>>>>   drivers/media/v4l2-core/Kconfig               |    4 +
>>>>   drivers/media/v4l2-core/Makefile              |    1 +
>>>>   drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>>>   drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>>>   drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>>>   drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>>>   drivers/staging/media/hantro/Kconfig          |    1 +
>>>>   drivers/staging/media/hantro/Makefile         |    7 +-
>>>>   drivers/staging/media/hantro/hantro.h         |   40 +-
>>>>   drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>>>   drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>>>   .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>>>   drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>>>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>>>   drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>>>   .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>>>   drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>>>   drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>>>   drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>>>   drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>>>   .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>>>   .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>>>   drivers/staging/media/rkvdec/Kconfig          |    1 +
>>>>   drivers/staging/media/rkvdec/Makefile         |    2 +-
>>>>   drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>>>   drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>>>   drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>>>   include/media/v4l2-ctrls.h                    |    4 +
>>>>   include/media/v4l2-vp9.h                      |  182 ++
>>>>   include/uapi/linux/v4l2-controls.h            |  284 +++
>>>>   include/uapi/linux/videodev2.h                |    6 +
>>>>   37 files changed, 6033 insertions(+), 104 deletions(-)
>>>>   create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>>>   create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>>>   create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>>>   create mode 100644 include/media/v4l2-vp9.h
>>>>
>>>>
>>>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>>>
>>>
>>
> 



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
                   ` (12 preceding siblings ...)
  2021-11-11 14:44 ` Hans Verkuil
@ 2021-11-15 15:07 ` Hans Verkuil
  2021-11-15 17:14   ` Andrzej Pietrasiewicz
  13 siblings, 1 reply; 37+ messages in thread
From: Hans Verkuil @ 2021-11-15 15:07 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Andrzej,

Can you rebase this series on top of the master branch of
https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
manner.

Regards,

	Hans

On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
> Dear all,
> 
> This patch series adds VP9 codec V4L2 control interface and two drivers
> using the new controls. It is a follow-up of previous v6 series [1].
> 
> In this iteration, we've implemented VP9 hardware decoding on two devices:
> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
> The i.MX8M driver needs proper power domains support, though, which is a
> subject of a different effort, but in all 3 cases we were able to run the
> drivers.
> 
> GStreamer support is also available, the needed changes have been submitted
> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
> VP9 V4L2 controls to be merged and released.
> 
> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
> using Fluster[3]. There are still a few tests that are not passing, due to
> dynamic frame resize (not yet supported by V4L2) and small size videos
> (due to IP block limitations).
> 
> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
> merged without passing through staging, as agreed[4]. The ABI has been checked
> for padding and verified to contain no holes.
> 
> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
> [3] https://github.com/fluendo/fluster
> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
> 
> The series depends on the YUV tiled format support prepared by Ezequiel:
> https://www.spinics.net/lists/linux-media/msg197047.html
> 
> Rebased onto latest media_tree.
> 
> Changes related to v6:
> - moved setting tile filter and tile bsd auxiliary buffer addresses so
> that they are always set, even if no tiles are used (thanks, Jernej)
> - added a comment near the place where the 32-bit DMA mask is applied
>   (thanks, Nicolas)
> - improved consistency in register names (thanks, Nicolas)
> 
> Changes related to v5:
> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
> - improved pdf output of documentation
> - added Benjamin's Reviewed-by (thanks, Benjamin)
> 
> Changes related to v4:
> - removed unused enum v4l2_vp9_intra_prediction_mode
> - converted remaining enums to defines to follow the convention
> - improved the documentation, in particular better documented how to use
>   segmentation features
> 
> Changes related to v3:
> 
> Apply suggestions from Jernej's review (thanks, Jernej):
> - renamed a control and two structs:
> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
> 		v4l2_ctrl_vp9_compressed_hdr
> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
> - explicitly assigned values to all other vp9 enums
> 
> Apply suggestion from Nicolas's review (thanks, Nicolas):
> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
> and implemented only by drivers which need it
> 
> Changes related to the RFC v2:
> 
> - added another driver including a postprocessor to de-tile
>         codec-specific tiling
> - reworked uAPI structs layout to follow VP8 style
> - changed validation of loop filter params
> - changed validation of segmentation params
> - changed validation of VP9 frame params
> - removed level lookup array from loop filter struct
>         (can be computed by drivers)
> - renamed some enum values to match the spec more closely
> - V4L2 VP9 library changed the 'eob' member of
>         'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>         of pointers instead of an array of pointers to arrays
>         (IPs such as g2 creatively pass parts of the 'eob' counts in
>         the 'coeff' counts)
> - factored out several repeated portions of code
> - minor nitpicks and cleanups
> 
> Andrzej Pietrasiewicz (6):
>   media: uapi: Add VP9 stateless decoder controls
>   media: Add VP9 v4l2 library
>   media: hantro: Rename registers
>   media: hantro: Prepare for other G2 codecs
>   media: hantro: Support VP9 on the G2 core
>   media: hantro: Support NV12 on the G2 core
> 
> Boris Brezillon (1):
>   media: rkvdec: Add the VP9 backend
> 
> Ezequiel Garcia (4):
>   hantro: postproc: Fix motion vector space size
>   hantro: postproc: Introduce struct hantro_postproc_ops
>   hantro: Simplify postprocessor
>   hantro: Add quirk for NV12/NV12_4L4 capture format
> 
>  .../userspace-api/media/v4l/biblio.rst        |   10 +
>  .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>  .../media/v4l/pixfmt-compressed.rst           |   15 +
>  .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>  .../media/v4l/vidioc-queryctrl.rst            |   12 +
>  .../media/videodev2.h.rst.exceptions          |    2 +
>  drivers/media/v4l2-core/Kconfig               |    4 +
>  drivers/media/v4l2-core/Makefile              |    1 +
>  drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>  drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>  drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>  drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>  drivers/staging/media/hantro/Kconfig          |    1 +
>  drivers/staging/media/hantro/Makefile         |    7 +-
>  drivers/staging/media/hantro/hantro.h         |   40 +-
>  drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>  drivers/staging/media/hantro/hantro_g2.c      |   27 +
>  .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>  drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>  .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>  .../staging/media/hantro/hantro_postproc.c    |   79 +-
>  drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>  drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>  drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>  drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>  .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>  .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>  drivers/staging/media/rkvdec/Kconfig          |    1 +
>  drivers/staging/media/rkvdec/Makefile         |    2 +-
>  drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>  drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>  drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>  include/media/v4l2-ctrls.h                    |    4 +
>  include/media/v4l2-vp9.h                      |  182 ++
>  include/uapi/linux/v4l2-controls.h            |  284 +++
>  include/uapi/linux/videodev2.h                |    6 +
>  37 files changed, 6033 insertions(+), 104 deletions(-)
>  create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>  create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>  create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>  create mode 100644 include/media/v4l2-vp9.h
> 
> 
> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
> 



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-15 15:07 ` Hans Verkuil
@ 2021-11-15 17:14   ` Andrzej Pietrasiewicz
  2021-11-15 21:16     ` Hans Verkuil
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-15 17:14 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi Hans,

On 15.11.2021 at 16:07, Hans Verkuil wrote:
> Andrzej,
> 
> Can you rebase this series on top of the master branch of
> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
> manner.

This is a branch for you:

https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi

Regards,

Andrzej


> 
> Regards,
> 
> 	Hans
> 
> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>> Dear all,
>>
>> This patch series adds VP9 codec V4L2 control interface and two drivers
>> using the new controls. It is a follow-up of previous v6 series [1].
>>
>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>> The i.MX8M driver needs proper power domains support, though, which is a
>> subject of a different effort, but in all 3 cases we were able to run the
>> drivers.
>>
>> GStreamer support is also available, the needed changes have been submitted
>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>> VP9 V4L2 controls to be merged and released.
>>
>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>> using Fluster[3]. There are still a few tests that are not passing, due to
>> dynamic frame resize (not yet supported by V4L2) and small size videos
>> (due to IP block limitations).
>>
>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>> merged without passing through staging, as agreed[4]. The ABI has been checked
>> for padding and verified to contain no holes.
>>
>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>> [3] https://github.com/fluendo/fluster
>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>
>> The series depends on the YUV tiled format support prepared by Ezequiel:
>> https://www.spinics.net/lists/linux-media/msg197047.html
>>
>> Rebased onto latest media_tree.
>>
>> Changes related to v6:
>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>> that they are always set, even if no tiles are used (thanks, Jernej)
>> - added a comment near the place where the 32-bit DMA mask is applied
>>    (thanks, Nicolas)
>> - improved consistency in register names (thanks, Nicolas)
>>
>> Changes related to v5:
>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>> - improved pdf output of documentation
>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>
>> Changes related to v4:
>> - removed unused enum v4l2_vp9_intra_prediction_mode
>> - converted remaining enums to defines to follow the convention
>> - improved the documentation, in particular better documented how to use
>>   segmentation features
>>
>> Changes related to v3:
>>
>> Apply suggestions from Jernej's review (thanks, Jernej):
>> - renamed a control and two structs:
>> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
>> 		v4l2_ctrl_vp9_compressed_hdr
>> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
>> - explicitly assigned values to all other vp9 enums
>>
>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>> and implemented only by drivers which need it
>>
>> Changes related to the RFC v2:
>>
>> - added another driver including a postprocessor to de-tile
>>          codec-specific tiling
>> - reworked uAPI structs layout to follow VP8 style
>> - changed validation of loop filter params
>> - changed validation of segmentation params
>> - changed validation of VP9 frame params
>> - removed level lookup array from loop filter struct
>>          (can be computed by drivers)
>> - renamed some enum values to match the spec more closely
>> - V4L2 VP9 library changed the 'eob' member of
>>          'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>          of pointers instead of an array of pointers to arrays
>>          (IPs such as g2 creatively pass parts of the 'eob' counts in
>>          the 'coeff' counts)
>> - factored out several repeated portions of code
>> - minor nitpicks and cleanups
>>
>> Andrzej Pietrasiewicz (6):
>>    media: uapi: Add VP9 stateless decoder controls
>>    media: Add VP9 v4l2 library
>>    media: hantro: Rename registers
>>    media: hantro: Prepare for other G2 codecs
>>    media: hantro: Support VP9 on the G2 core
>>    media: hantro: Support NV12 on the G2 core
>>
>> Boris Brezillon (1):
>>    media: rkvdec: Add the VP9 backend
>>
>> Ezequiel Garcia (4):
>>    hantro: postproc: Fix motion vector space size
>>    hantro: postproc: Introduce struct hantro_postproc_ops
>>    hantro: Simplify postprocessor
>>    hantro: Add quirk for NV12/NV12_4L4 capture format
>>
>>   .../userspace-api/media/v4l/biblio.rst        |   10 +
>>   .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>   .../media/v4l/pixfmt-compressed.rst           |   15 +
>>   .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>   .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>   .../media/videodev2.h.rst.exceptions          |    2 +
>>   drivers/media/v4l2-core/Kconfig               |    4 +
>>   drivers/media/v4l2-core/Makefile              |    1 +
>>   drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>   drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>   drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>   drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>   drivers/staging/media/hantro/Kconfig          |    1 +
>>   drivers/staging/media/hantro/Makefile         |    7 +-
>>   drivers/staging/media/hantro/hantro.h         |   40 +-
>>   drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>   drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>   .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>   drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>   drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>   .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>   drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>   drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>   drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>   drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>   .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>   .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>   drivers/staging/media/rkvdec/Kconfig          |    1 +
>>   drivers/staging/media/rkvdec/Makefile         |    2 +-
>>   drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>   drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>   drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>   include/media/v4l2-ctrls.h                    |    4 +
>>   include/media/v4l2-vp9.h                      |  182 ++
>>   include/uapi/linux/v4l2-controls.h            |  284 +++
>>   include/uapi/linux/videodev2.h                |    6 +
>>   37 files changed, 6033 insertions(+), 104 deletions(-)
>>   create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>   create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>   create mode 100644 include/media/v4l2-vp9.h
>>
>>
>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>
> 



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-15 17:14   ` Andrzej Pietrasiewicz
@ 2021-11-15 21:16     ` Hans Verkuil
  2021-11-16  8:09       ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Hans Verkuil @ 2021-11-15 21:16 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
> Hi Hans,
> 
> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>> Andrzej,
>>
>> Can you rebase this series on top of the master branch of
>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>> manner.
> 
> This is a branch for you:
> 
> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi

I'm getting a bunch of sparse/smatch warnings:

sparse:
rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?

smatch:
rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91
hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'adaptive->inter_mode[i]' too small (4 vs 21)
hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'probs->inter_mode[i]' too small (3 vs 21)
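
To illustrate the two smatch error classes above (a memcpy() whose size
exceeds one row of a two-dimensional array, and an index that runs past a
90-entry buffer), here is a minimal stand-alone C sketch. It is not the
driver code: the struct names, the helper names and the 7x3 / 7x4 layouts
are hypothetical, chosen only so that the numbers resemble the ones in the
report (3 and 4 bytes per row, 21 bytes for a whole 7x3 array).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical layouts: 7 contexts with 3 probabilities each in the
 * source, padded to 4 bytes per context in the hardware-facing copy. */
struct demo_probs {
	uint8_t inter_mode[7][3];
};

struct demo_hw_probs {
	uint8_t inter_mode[7][4];
};

/*
 * Copy row by row, sizing each memcpy() by one source row (3 bytes).
 * Passing sizeof(src->inter_mode) instead would request 21 bytes per
 * row, which is the kind of mismatch a static checker flags as
 * "too small (4 vs 21)".
 */
void demo_copy_inter_mode(struct demo_hw_probs *dst,
			  const struct demo_probs *src)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(src->inter_mode); i++) {
		memcpy(dst->inter_mode[i], src->inter_mode[i],
		       sizeof(src->inter_mode[i]));
		dst->inter_mode[i][3] = 0; /* explicit padding byte */
	}
}

/*
 * Fill a 90-entry table. The loop bound is derived from the array type,
 * so the last index written is 89; a hand-written bound that lets the
 * index reach 90 or 91 would overflow the buffer.
 */
size_t demo_fill_table(uint8_t (*table)[90], uint8_t value)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(*table); i++)
		(*table)[i] = value;

	return i; /* number of entries written */
}
```

Deriving both the copy size and the loop bound from the array types keeps
them in sync if a dimension changes later. Whether the actual fix in the
drivers should take this shape is for the authors to judge.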

Also a bunch of kerneldoc warnings:

include/media/v4l2-vp9.h:30: warning: Function parameter or member 'joint' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_bit' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_fr' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'fr' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:30: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_mv_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx8' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx16' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx32' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'coef' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'inter_mode' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'interp_filter' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'is_inter' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_mode' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:58: warning: Function parameter or member 'mv' not described in 'v4l2_vp9_frame_context'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'intra_inter' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx32p' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx16p' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx8p' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'filter' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_joint' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_fp' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'fp' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'coeff' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:93: warning: Function parameter or member 'eob' not described in 'v4l2_vp9_frame_symbol_counts'
include/media/v4l2-vp9.h:166: warning: expecting prototype for v4l2_vp9_adapt_coef_probs(). Prototype was for v4l2_vp9_adapt_noncoef_probs() instead
drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'vp_clk_pol' not described in 'isp_ccp2_cfg'
drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'lanecfg' not described in 'isp_ccp2_cfg'
drivers/media/platform/qcom/venus/core.h:202: warning: Function parameter or member 'sys_err_done' not described in 'venus_core'
drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'fw_min_cnt' not described in 'venus_inst'
drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'flags' not described in 'venus_inst'
drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'dpb_ids' not described in 'venus_inst'
drivers/staging/media/hantro/hantro.h:115: warning: Enum value 'HANTRO_MODE_VP9_DEC' not described in enum 'hantro_codec_mode'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_edge' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'misc' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cnts' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'probability_tables' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'frame_context' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cur' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'bsd_ctrl_offset' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map_size' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'ctx_counters_offset' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_info_offset' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_r_info' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_c_info' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_r' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_c' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_r' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_c' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'active_segment' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_enabled' not described in 'hantro_vp9_dec_hw_ctx'
drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_data' not described in 'hantro_vp9_dec_hw_ctx'

You can test kerneldoc yourself with: scripts/kernel-doc -none include/media/v4l2-vp9.h
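
For reference, a minimal sketch of the comment layout kernel-doc expects. The struct and function below are hypothetical (they are not taken from v4l2-vp9.h); they only illustrate the two kinds of warning listed above: an undescribed member, and a comment whose name does not match the prototype that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/**
 * struct demo_mv_counts - hypothetical motion vector symbol counters
 * @joint: number of times each of the four joint types was decoded
 * @sign: number of times each sign value was decoded
 *
 * Every member needs its own "@name:" line, otherwise kernel-doc reports
 * "Function parameter or member 'name' not described".
 */
struct demo_mv_counts {
	uint32_t joint[4];
	uint32_t sign[2];
};

/**
 * demo_mv_counts_total() - sum all counters in a &struct demo_mv_counts
 * @c: counters to sum, must not be NULL
 *
 * The name on the first line has to match the function that follows,
 * otherwise kernel-doc reports "expecting prototype for ...".
 *
 * Return: the sum of all @joint and @sign counters.
 */
static uint32_t demo_mv_counts_total(const struct demo_mv_counts *c)
{
	uint32_t total = 0;
	size_t i;

	assert(c != NULL);

	for (i = 0; i < sizeof(c->joint) / sizeof(c->joint[0]); i++)
		total += c->joint[i];
	for (i = 0; i < sizeof(c->sign) / sizeof(c->sign[0]); i++)
		total += c->sign[i];

	return total;
}
```

With every member and parameter described like this, running scripts/kernel-doc -none on the file should no longer print these warnings.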

Regards,

	Hans

> 
> Regards,
> 
> Andrzej
> 
> 
>>
>> Regards,
>>
>> 	Hans
>>
>> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>>> Dear all,
>>>
>>> This patch series adds VP9 codec V4L2 control interface and two drivers
>>> using the new controls. It is a follow-up of previous v6 series [1].
>>>
>>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>>> The i.MX8M driver needs proper power domains support, though, which is a
>>> subject of a different effort, but in all 3 cases we were able to run the
>>> drivers.
>>>
>>> GStreamer support is also available, the needed changes have been submitted
>>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>>> VP9 V4L2 controls to be merged and released.
>>>
>>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>>> using Fluster[3]. There are still a few tests that are not passing, due to
>>> dynamic frame resize (not yet supported by V4L2) and small size videos
>>> (due to IP block limitations).
>>>
>>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>>> merged without passing through staging, as agreed[4]. The ABI has been checked
>>> for padding and verified to contain no holes.
>>>
>>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>>> [3] https://github.com/fluendo/fluster
>>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>>
>>> The series depends on the YUV tiled format support prepared by Ezequiel:
>>> https://www.spinics.net/lists/linux-media/msg197047.html
>>>
>>> Rebased onto latest media_tree.
>>>
>>> Changes related to v6:
>>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>>> that they are always set, even if no tiles are used (thanks, Jernej)
>>> - added a comment near the place where the 32-bit DMA mask is applied
>>>    (thanks, Nicolas)
>>> - improved consistency in register names (thanks, Nicolas)
>>>
>>> Changes related to v5:
>>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>>> - improved pdf output of documentation
>>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>>
>>> Changes related to v4:
>>> - removed unused enum v4l2_vp9_intra_prediction_mode
>>> - converted remaining enums to defines to follow the convention
>>> - improved the documentation, in particular better documented how to use segmentation
>>> features
>>>
>>> Changes related to v3:
>>>
>>> Apply suggestions from Jernej's review (thanks, Jernej):
>>> - renamed a control and two structs:
>>> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>>> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>>> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
>>> 		v4l2_ctrl_vp9_compressed_hdr
>>> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
>>> - explicitly assigned values to all other vp9 enums
>>>
>>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>>> and implemented only by drivers which need it
>>>
>>> Changes related to the RFC v2:
>>>
>>> - added another driver including a postprocessor to de-tile
>>>          codec-specific tiling
>>> - reworked uAPI structs layout to follow VP8 style
>>> - changed validation of loop filter params
>>> - changed validation of segmentation params
>>> - changed validation of VP9 frame params
>>> - removed level lookup array from loop filter struct
>>>          (can be computed by drivers)
>>> - renamed some enum values to match the spec more closely
>>> - V4L2 VP9 library changed the 'eob' member of
>>>          'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>>          of pointers instead of an array of pointers to arrays
>>>          (IPs such as g2 creatively pass parts of the 'eob' counts in
>>>          the 'coeff' counts)
>>> - factored out several repeated portions of code
>>> - minor nitpicks and cleanups
>>>
>>> Andrzej Pietrasiewicz (6):
>>>    media: uapi: Add VP9 stateless decoder controls
>>>    media: Add VP9 v4l2 library
>>>    media: hantro: Rename registers
>>>    media: hantro: Prepare for other G2 codecs
>>>    media: hantro: Support VP9 on the G2 core
>>>    media: hantro: Support NV12 on the G2 core
>>>
>>> Boris Brezillon (1):
>>>    media: rkvdec: Add the VP9 backend
>>>
>>> Ezequiel Garcia (4):
>>>    hantro: postproc: Fix motion vector space size
>>>    hantro: postproc: Introduce struct hantro_postproc_ops
>>>    hantro: Simplify postprocessor
>>>    hantro: Add quirk for NV12/NV12_4L4 capture format
>>>
>>>   .../userspace-api/media/v4l/biblio.rst        |   10 +
>>>   .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>>   .../media/v4l/pixfmt-compressed.rst           |   15 +
>>>   .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>>   .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>>   .../media/videodev2.h.rst.exceptions          |    2 +
>>>   drivers/media/v4l2-core/Kconfig               |    4 +
>>>   drivers/media/v4l2-core/Makefile              |    1 +
>>>   drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>>   drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>>   drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>>   drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>>   drivers/staging/media/hantro/Kconfig          |    1 +
>>>   drivers/staging/media/hantro/Makefile         |    7 +-
>>>   drivers/staging/media/hantro/hantro.h         |   40 +-
>>>   drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>>   drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>>   .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>>   drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>>   .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>>   drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>>   .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>>   drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>>   drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>>   drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>>   drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>>   .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>>   .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>>   drivers/staging/media/rkvdec/Kconfig          |    1 +
>>>   drivers/staging/media/rkvdec/Makefile         |    2 +-
>>>   drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>>   drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>>   drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>>   include/media/v4l2-ctrls.h                    |    4 +
>>>   include/media/v4l2-vp9.h                      |  182 ++
>>>   include/uapi/linux/v4l2-controls.h            |  284 +++
>>>   include/uapi/linux/videodev2.h                |    6 +
>>>   37 files changed, 6033 insertions(+), 104 deletions(-)
>>>   create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>>   create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>>   create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>>   create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>>   create mode 100644 include/media/v4l2-vp9.h
>>>
>>>
>>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>>
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-15 21:16     ` Hans Verkuil
@ 2021-11-16  8:09       ` Andrzej Pietrasiewicz
  2021-11-16  8:21         ` Hans Verkuil
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-16  8:09 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi Hans,

On 15.11.2021 at 22:16, Hans Verkuil wrote:
> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>> Hi Hans,
>>
>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>> Andrzej,
>>>
>>> Can you rebase this series on top of the master branch of
>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>> manner.
>>
>> This is a branch for you:
>>
>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
> 
> I'm getting a bunch of sparse/smatch warnings:
> 

Thanks for finding this, I will re-create the branch and let you know on IRC.
Some of the below are "false positives", namely:

drivers/media/platform/omap3isp/omap3isp.h
drivers/media/platform/qcom/venus/core.h

which are not touched by the series.

Regards,

Andrzej

> sparse:
> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
> 
> smatch:
> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91
> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'adaptive->inter_mode[i]' too small (4 vs 21)
> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'probs->inter_mode[i]' too small (3 vs 21)
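
The two memcpy() reports (4 vs 21 and 3 vs 21) look like what smatch prints when a per-row copy is sized with sizeof() of the whole table rather than of one row. This has not been checked against the driver source; the sketch below uses hypothetical types and names and only illustrates that class of bug and the per-row size that avoids it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CTXS 7

/* Hypothetical source table: 3 probabilities per context, 7 * 3 = 21 bytes. */
struct demo_probs {
	uint8_t inter_mode[DEMO_CTXS][3];
};

/* Hypothetical hardware table: each row is padded to 4 bytes. */
struct demo_hw_probs {
	uint8_t inter_mode[DEMO_CTXS][4];
};

static void demo_copy_inter_mode(struct demo_hw_probs *dst,
				 const struct demo_probs *src)
{
	size_t i;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < DEMO_CTXS; i++) {
		/*
		 * sizeof(src->inter_mode) is 21 and would overrun both rows;
		 * sizeof(src->inter_mode[i]) is 3 and copies exactly one row.
		 */
		memcpy(dst->inter_mode[i], src->inter_mode[i],
		       sizeof(src->inter_mode[i]));
	}
}
```

The padding byte at the end of each destination row is left untouched by the copy.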
> 
> Also a bunch of kerneldoc warnings:
> 
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'joint' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_bit' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_fr' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'fr' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_mv_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx8' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx16' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx32' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'coef' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'inter_mode' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'interp_filter' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'is_inter' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_mode' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'mv' not described in 'v4l2_vp9_frame_context'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'intra_inter' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx32p' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx16p' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx8p' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'filter' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_joint' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_fp' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'fp' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'coeff' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'eob' not described in 'v4l2_vp9_frame_symbol_counts'
> include/media/v4l2-vp9.h:166: warning: expecting prototype for v4l2_vp9_adapt_coef_probs(). Prototype was for v4l2_vp9_adapt_noncoef_probs() instead
> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'vp_clk_pol' not described in 'isp_ccp2_cfg'
> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'lanecfg' not described in 'isp_ccp2_cfg'
> drivers/media/platform/qcom/venus/core.h:202: warning: Function parameter or member 'sys_err_done' not described in 'venus_core'
> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'fw_min_cnt' not described in 'venus_inst'
> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'flags' not described in 'venus_inst'
> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'dpb_ids' not described in 'venus_inst'
> drivers/staging/media/hantro/hantro.h:115: warning: Enum value 'HANTRO_MODE_VP9_DEC' not described in enum 'hantro_codec_mode'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_edge' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'misc' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cnts' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'probability_tables' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'frame_context' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cur' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'bsd_ctrl_offset' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map_size' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'ctx_counters_offset' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_info_offset' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_r_info' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_c_info' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_r' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_c' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_r' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_c' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'active_segment' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_enabled' not described in 'hantro_vp9_dec_hw_ctx'
> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_data' not described in 'hantro_vp9_dec_hw_ctx'
> 
> You can test kerneldoc yourself with: scripts/kernel-doc -none include/media/v4l2-vp9.h
> 
> Regards,
> 
> 	Hans
> 
>>
>> Regards,
>>
>> Andrzej
>>
>>
>>>
>>> Regards,
>>>
>>> 	Hans
>>>
>>> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>>>> Dear all,
>>>>
>>>> This patch series adds VP9 codec V4L2 control interface and two drivers
>>>> using the new controls. It is a follow-up of previous v6 series [1].
>>>>
>>>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>>>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>>>> The i.MX8M driver needs proper power domains support, though, which is a
>>>> subject of a different effort, but in all 3 cases we were able to run the
>>>> drivers.
>>>>
>>>> GStreamer support is also available, the needed changes have been submitted
>>>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>>>> VP9 V4L2 controls to be merged and released.
>>>>
>>>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>>>> using Fluster[3]. There are still a few tests that are not passing, due to
>>>> dynamic frame resize (not yet supported by V4L2) and small size videos
>>>> (due to IP block limitations).
>>>>
>>>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>>>> merged without passing through staging, as agreed[4]. The ABI has been checked
>>>> for padding and verified to contain no holes.
>>>>
>>>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>>>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>>>> [3] https://github.com/fluendo/fluster
>>>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>>>
>>>> The series depends on the YUV tiled format support prepared by Ezequiel:
>>>> https://www.spinics.net/lists/linux-media/msg197047.html
>>>>
>>>> Rebased onto latest media_tree.
>>>>
>>>> Changes related to v6:
>>>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>>>> that they are always set, even if no tiles are used (thanks, Jernej)
>>>> - added a comment near the place where the 32-bit DMA mask is applied
>>>>     (thanks, Nicolas)
>>>> - improved consistency in register names (thanks, Nicolas)
>>>>
>>>> Changes related to v5:
>>>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>>>> - improved pdf output of documentation
>>>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>>>
>>>> Changes related to v4:
>>>> - removed unused enum v4l2_vp9_intra_prediction_mode
>>>> - converted remaining enums to defines to follow the convention
>>>> - improved the documentation, in particular better documented how to use segmentation
>>>> features
>>>>
>>>> Changes related to v3:
>>>>
>>>> Apply suggestions from Jernej's review (thanks, Jernej):
>>>> - renamed a control and two structs:
>>>> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>>>> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>>>> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
>>>> 		v4l2_ctrl_vp9_compressed_hdr
>>>> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>>>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>>>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
>>>> - explicitly assigned values to all other vp9 enums
>>>>
>>>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>>>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>>>> and implemented only by drivers which need it
>>>>
>>>> Changes related to the RFC v2:
>>>>
>>>> - added another driver including a postprocessor to de-tile
>>>>           codec-specific tiling
>>>> - reworked uAPI structs layout to follow VP8 style
>>>> - changed validation of loop filter params
>>>> - changed validation of segmentation params
>>>> - changed validation of VP9 frame params
>>>> - removed level lookup array from loop filter struct
>>>>           (can be computed by drivers)
>>>> - renamed some enum values to match the spec more closely
>>>> - V4L2 VP9 library changed the 'eob' member of
>>>>           'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>>>           of pointers instead of an array of pointers to arrays
>>>>           (IPs such as g2 creatively pass parts of the 'eob' counts in
>>>>           the 'coeff' counts)
>>>> - factored out several repeated portions of code
>>>> - minor nitpicks and cleanups
>>>>
>>>> Andrzej Pietrasiewicz (6):
>>>>     media: uapi: Add VP9 stateless decoder controls
>>>>     media: Add VP9 v4l2 library
>>>>     media: hantro: Rename registers
>>>>     media: hantro: Prepare for other G2 codecs
>>>>     media: hantro: Support VP9 on the G2 core
>>>>     media: hantro: Support NV12 on the G2 core
>>>>
>>>> Boris Brezillon (1):
>>>>     media: rkvdec: Add the VP9 backend
>>>>
>>>> Ezequiel Garcia (4):
>>>>     hantro: postproc: Fix motion vector space size
>>>>     hantro: postproc: Introduce struct hantro_postproc_ops
>>>>     hantro: Simplify postprocessor
>>>>     hantro: Add quirk for NV12/NV12_4L4 capture format
>>>>
>>>>    .../userspace-api/media/v4l/biblio.rst        |   10 +
>>>>    .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>>>    .../media/v4l/pixfmt-compressed.rst           |   15 +
>>>>    .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>>>    .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>>>    .../media/videodev2.h.rst.exceptions          |    2 +
>>>>    drivers/media/v4l2-core/Kconfig               |    4 +
>>>>    drivers/media/v4l2-core/Makefile              |    1 +
>>>>    drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>>>    drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>>>    drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>>>    drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>>>    drivers/staging/media/hantro/Kconfig          |    1 +
>>>>    drivers/staging/media/hantro/Makefile         |    7 +-
>>>>    drivers/staging/media/hantro/hantro.h         |   40 +-
>>>>    drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>>>    drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>>>    .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>>>    drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>>>    .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>>>    drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>>>    .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>>>    drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>>>    drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>>>    drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>>>    drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>>>    .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>>>    .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>>>    drivers/staging/media/rkvdec/Kconfig          |    1 +
>>>>    drivers/staging/media/rkvdec/Makefile         |    2 +-
>>>>    drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>>>    drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>>>    drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>>>    include/media/v4l2-ctrls.h                    |    4 +
>>>>    include/media/v4l2-vp9.h                      |  182 ++
>>>>    include/uapi/linux/v4l2-controls.h            |  284 +++
>>>>    include/uapi/linux/videodev2.h                |    6 +
>>>>    37 files changed, 6033 insertions(+), 104 deletions(-)
>>>>    create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>>>    create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>>>    create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>>>    create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>>>    create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>>>    create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>>>    create mode 100644 include/media/v4l2-vp9.h
>>>>
>>>>
>>>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-16  8:09       ` Andrzej Pietrasiewicz
@ 2021-11-16  8:21         ` Hans Verkuil
  2021-11-16 13:14           ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Hans Verkuil @ 2021-11-16  8:21 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
> Hi Hans,
> 
> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>> Hi Hans,
>>>
>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>> Andrzej,
>>>>
>>>> Can you rebase this series on top of the master branch of
>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>>> manner.
>>>
>>> This is a branch for you:
>>>
>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>
>> I'm getting a bunch of sparse/smatch warnings:
>>
> 
> Thanks for finding this, I will re-create the branch and let you know on IRC.
> Some of the below are "false positives", namely:
> 
> drivers/media/platform/omap3isp/omap3isp.h
> drivers/media/platform/qcom/venus/core.h

Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.

Please post a v8. That way the series is archived on lore. And it works better
with patchwork.

Regards,

	Hans

> 
> which are not touched by the series.
> 
> Regards,
> 
> Andrzej
> 
>> sparse:
>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>
>> smatch:
>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91
>> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'adaptive->inter_mode[i]' too small (4 vs 21)
>> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'probs->inter_mode[i]' too small (3 vs 21)
>>
>> Also a bunch of kerneldoc warnings:
>>
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'joint' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_bit' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_fr' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'fr' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_mv_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx8' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx16' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx32' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'coef' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'inter_mode' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'interp_filter' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'is_inter' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_mode' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'mv' not described in 'v4l2_vp9_frame_context'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'intra_inter' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx32p' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx16p' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx8p' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'filter' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_joint' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_fp' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'fp' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'coeff' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'eob' not described in 'v4l2_vp9_frame_symbol_counts'
>> include/media/v4l2-vp9.h:166: warning: expecting prototype for v4l2_vp9_adapt_coef_probs(). Prototype was for v4l2_vp9_adapt_noncoef_probs() instead
>> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'vp_clk_pol' not described in 'isp_ccp2_cfg'
>> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'lanecfg' not described in 'isp_ccp2_cfg'
>> drivers/media/platform/qcom/venus/core.h:202: warning: Function parameter or member 'sys_err_done' not described in 'venus_core'
>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'fw_min_cnt' not described in 'venus_inst'
>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'flags' not described in 'venus_inst'
>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'dpb_ids' not described in 'venus_inst'
>> drivers/staging/media/hantro/hantro.h:115: warning: Enum value 'HANTRO_MODE_VP9_DEC' not described in enum 'hantro_codec_mode'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_edge' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'misc' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cnts' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'probability_tables' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'frame_context' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cur' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'bsd_ctrl_offset' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map_size' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'ctx_counters_offset' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_info_offset' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_r_info' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_c_info' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_r' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_c' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_r' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_c' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'active_segment' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_enabled' not described in 'hantro_vp9_dec_hw_ctx'
>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_data' not described in 'hantro_vp9_dec_hw_ctx'
>>
>> You can test kerneldoc yourself with: scripts/kernel-doc -none include/media/v4l2-vp9.h
>>
>> Regards,
>>
>> 	Hans
>>
>>>
>>> Regards,
>>>
>>> Andrzej
>>>
>>>
>>>>
>>>> Regards,
>>>>
>>>> 	Hans
>>>>
>>>> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>>>>> Dear all,
>>>>>
>>>>> This patch series adds VP9 codec V4L2 control interface and two drivers
>>>>> using the new controls. It is a follow-up of previous v6 series [1].
>>>>>
>>>>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>>>>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>>>>> The i.MX8M driver needs proper power domains support, though, which is a
>>>>> subject of a different effort, but in all 3 cases we were able to run the
>>>>> drivers.
>>>>>
>>>>> GStreamer support is also available, the needed changes have been submitted
>>>>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>>>>> VP9 V4L2 controls to be merged and released.
>>>>>
>>>>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>>>>> using Fluster[3]. There are still a few tests that are not passing, due to
>>>>> dynamic frame resize (not yet supported by V4L2) and small size videos
>>>>> (due to IP block limitations).
>>>>>
>>>>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>>>>> merged without passing through staging, as agreed[4]. The ABI has been checked
>>>>> for padding and verified to contain no holes.
>>>>>
>>>>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>>>>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>>>>> [3] https://github.com/fluendo/fluster
>>>>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>>>>
>>>>> The series depends on the YUV tiled format support prepared by Ezequiel:
>>>>> https://www.spinics.net/lists/linux-media/msg197047.html
>>>>>
>>>>> Rebased onto latest media_tree.
>>>>>
>>>>> Changes related to v6:
>>>>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>>>>> that they are always set, even if no tiles are used (thanks, Jernej)
>>>>> - added a comment near the place where the 32-bit DMA mask is applied
>>>>>     (thanks, Nicolas)
>>>>> - improved consistency in register names (thanks, Nicolas)
>>>>>
>>>>> Changes related to v5:
>>>>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>>>>> - improved pdf output of documentation
>>>>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>>>>
>>>>> Changes related to v4:
>>>>> - removed unused enum v4l2_vp9_intra_prediction_mode
>>>>> - converted remaining enums to defines to follow the convention
>>>>> - improved the documentation, in particular better documented how to use segmentation
>>>>> features
>>>>>
>>>>> Changes related to v3:
>>>>>
>>>>> Apply suggestions from Jernej's review (thanks, Jernej):
>>>>> - renamed a control and two structs:
>>>>> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>>>>> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>>>>> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
>>>>> 		v4l2_ctrl_vp9_compressed_hdr
>>>>> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>>>>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>>>>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
>>>>> - explicitly assigned values to all other vp9 enums
>>>>>
>>>>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>>>>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>>>>> and implemented only by drivers which need it
>>>>>
>>>>> Changes related to the RFC v2:
>>>>>
>>>>> - added another driver including a postprocessor to de-tile
>>>>>           codec-specific tiling
>>>>> - reworked uAPI structs layout to follow VP8 style
>>>>> - changed validation of loop filter params
>>>>> - changed validation of segmentation params
>>>>> - changed validation of VP9 frame params
>>>>> - removed level lookup array from loop filter struct
>>>>>           (can be computed by drivers)
>>>>> - renamed some enum values to match the spec more closely
>>>>> - V4L2 VP9 library changed the 'eob' member of
>>>>>           'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>>>>           of pointers instead of an array of pointers to arrays
>>>>>           (IPs such as g2 creatively pass parts of the 'eob' counts in
>>>>>           the 'coeff' counts)
>>>>> - factored out several repeated portions of code
>>>>> - minor nitpicks and cleanups
>>>>>
>>>>> Andrzej Pietrasiewicz (6):
>>>>>     media: uapi: Add VP9 stateless decoder controls
>>>>>     media: Add VP9 v4l2 library
>>>>>     media: hantro: Rename registers
>>>>>     media: hantro: Prepare for other G2 codecs
>>>>>     media: hantro: Support VP9 on the G2 core
>>>>>     media: hantro: Support NV12 on the G2 core
>>>>>
>>>>> Boris Brezillon (1):
>>>>>     media: rkvdec: Add the VP9 backend
>>>>>
>>>>> Ezequiel Garcia (4):
>>>>>     hantro: postproc: Fix motion vector space size
>>>>>     hantro: postproc: Introduce struct hantro_postproc_ops
>>>>>     hantro: Simplify postprocessor
>>>>>     hantro: Add quirk for NV12/NV12_4L4 capture format
>>>>>
>>>>>    .../userspace-api/media/v4l/biblio.rst        |   10 +
>>>>>    .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>>>>    .../media/v4l/pixfmt-compressed.rst           |   15 +
>>>>>    .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>>>>    .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>>>>    .../media/videodev2.h.rst.exceptions          |    2 +
>>>>>    drivers/media/v4l2-core/Kconfig               |    4 +
>>>>>    drivers/media/v4l2-core/Makefile              |    1 +
>>>>>    drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>>>>    drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>>>>    drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>>>>    drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>>>>    drivers/staging/media/hantro/Kconfig          |    1 +
>>>>>    drivers/staging/media/hantro/Makefile         |    7 +-
>>>>>    drivers/staging/media/hantro/hantro.h         |   40 +-
>>>>>    drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>>>>    drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>>>>    .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>>>>    drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>>>>    .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>>>>    drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>>>>    .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>>>>    drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>>>>    drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>>>>    drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>>>>    drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>>>>    .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>>>>    .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>>>>    drivers/staging/media/rkvdec/Kconfig          |    1 +
>>>>>    drivers/staging/media/rkvdec/Makefile         |    2 +-
>>>>>    drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>>>>    drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>>>>    drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>>>>    include/media/v4l2-ctrls.h                    |    4 +
>>>>>    include/media/v4l2-vp9.h                      |  182 ++
>>>>>    include/uapi/linux/v4l2-controls.h            |  284 +++
>>>>>    include/uapi/linux/videodev2.h                |    6 +
>>>>>    37 files changed, 6033 insertions(+), 104 deletions(-)
>>>>>    create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>>>>    create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>>>>    create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>>>>    create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>>>>    create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>>>>    create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>>>>    create mode 100644 include/media/v4l2-vp9.h
>>>>>
>>>>>
>>>>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>>>>
>>>>
>>>
>>
> 



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-16  8:21         ` Hans Verkuil
@ 2021-11-16 13:14           ` Andrzej Pietrasiewicz
  2021-11-17  9:59             ` Hans Verkuil
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-16 13:14 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi,

On 16.11.2021 at 09:21, Hans Verkuil wrote:
> On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
>> Hi Hans,
>>
>> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>>> Hi Hans,
>>>>
>>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>>> Andrzej,
>>>>>
>>>>> Can you rebase this series on top of the master branch of
>>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>>>> manner.
>>>>
>>>> This is a branch for you:
>>>>
>>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>>
>>> I'm getting a bunch of sparse/smatch warnings:
>>>
>>
>> Thanks for finding this, I will re-create the branch and let you know on IRC.
>> Some of the below are "false positives", namely:
>>
>> drivers/media/platform/omap3isp/omap3isp.h
>> drivers/media/platform/qcom/venus/core.h
> 
> Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.
> 
> Please post a v8. That way the series is archived on lore. And it works better
> with patchwork.

Sure, no problem. Also please see below.

> 
> Regards,
> 
> 	Hans
> 
>>
>> which are not touched by the series.
>>
>> Regards,
>>
>> Andrzej
>>
>>> sparse:
>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>>
>>> smatch:
>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91

This looks like a false positive.

The memory pointed to by ptr is indexed with i * 23 + m, where i ranges
from 0 to 3, inclusive. For i < 3, m ranges from 0 to 22, inclusive;
for i == 3, m ranges from 0 to 20, inclusive.
So the largest index we ever compute is 89 (3 * 23 + 20).
Because ptr points to a buffer of at least 90 bytes, 89 is a valid
index and no greater index is ever computed.
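
The bound can also be checked mechanically. Below is a minimal standalone
sketch, not code from rkvdec-vp9.c: the constants and the function names
(max_ptr_index, index_fits, self_check) are hypothetical and only mirror
the ranges described above. It enumerates every i * 23 + m the loop nest
can produce and reports the largest one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants mirroring the ranges described in the mail. */
#define STRIDE          23	/* multiplier applied to i */
#define NUM_BLOCKS       4	/* i ranges over 0..3 */
#define LAST_BLOCK_LEN  21	/* m ranges over 0..20 when i == 3 */

/* Return the largest index i * STRIDE + m the loop nest can produce. */
size_t max_ptr_index(void)
{
	size_t max = 0;

	for (size_t i = 0; i < NUM_BLOCKS; i++) {
		/* Full blocks use m = 0..22, the last block stops at m = 20. */
		size_t m_count = (i < NUM_BLOCKS - 1) ? STRIDE : LAST_BLOCK_LEN;

		for (size_t m = 0; m < m_count; m++) {
			size_t idx = i * STRIDE + m;

			if (idx > max)
				max = idx;
		}
	}
	return max;
}

/* Return nonzero if every computed index fits in a buffer of buf_len bytes. */
int index_fits(size_t buf_len)
{
	return max_ptr_index() < buf_len;
}

/* Assert the claim made above: 90 bytes suffice, 89 would not. */
void self_check(void)
{
	assert(index_fits(90));
	assert(!index_fits(89));
}
```

Calling self_check() from a small test program passes under these
assumptions, which matches the reasoning above: a 90-byte buffer is exactly
large enough, so the "90 <= 91" report would need an index that this loop
shape never generates.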

>>> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'adaptive->inter_mode[i]' too small (4 vs 21)
>>> hantro/hantro_g2_vp9_dec.c: hantro/hantro_g2_vp9_dec.c:670 config_probs() error: memcpy() 'probs->inter_mode[i]' too small (3 vs 21)
>>>
>>> Also a bunch of kerneldoc warnings:
>>>
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'joint' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_bit' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_fr' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'fr' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:30: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_mv_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx8' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx16' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'tx32' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'coef' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'inter_mode' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'interp_filter' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'is_inter' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_mode' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:58: warning: Function parameter or member 'mv' not described in 'v4l2_vp9_frame_context'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'partition' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'skip' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'intra_inter' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx32p' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx16p' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'tx8p' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'y_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'uv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'comp_ref' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'single_ref' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_mode' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'filter' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'mv_joint' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'sign' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'classes' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'bits' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_fp' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'fp' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'class0_hp' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'hp' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'coeff' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:93: warning: Function parameter or member 'eob' not described in 'v4l2_vp9_frame_symbol_counts'
>>> include/media/v4l2-vp9.h:166: warning: expecting prototype for v4l2_vp9_adapt_coef_probs(). Prototype was for v4l2_vp9_adapt_noncoef_probs() instead
>>> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'vp_clk_pol' not described in 'isp_ccp2_cfg'
>>> drivers/media/platform/omap3isp/omap3isp.h:107: warning: Function parameter or member 'lanecfg' not described in 'isp_ccp2_cfg'
>>> drivers/media/platform/qcom/venus/core.h:202: warning: Function parameter or member 'sys_err_done' not described in 'venus_core'
>>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'fw_min_cnt' not described in 'venus_inst'
>>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'flags' not described in 'venus_inst'
>>> drivers/media/platform/qcom/venus/core.h:462: warning: Function parameter or member 'dpb_ids' not described in 'venus_inst'
>>> drivers/staging/media/hantro/hantro.h:115: warning: Enum value 'HANTRO_MODE_VP9_DEC' not described in enum 'hantro_codec_mode'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_edge' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'misc' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cnts' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'probability_tables' not described in
>>> 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'frame_context' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'cur' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'bsd_ctrl_offset' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'segment_map_size' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'ctx_counters_offset' not described in
>>> 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_info_offset' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_r_info' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'tile_c_info' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_r' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_tile_c' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_r' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'last_sbs_c' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'active_segment' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_enabled' not described in 'hantro_vp9_dec_hw_ctx'
>>> drivers/staging/media/hantro/hantro_hw.h:211: warning: Function parameter or member 'feature_data' not described in 'hantro_vp9_dec_hw_ctx'
>>>
>>> You can test kerneldoc yourself with: scripts/kernel-doc -none include/media/v4l2-vp9.h
>>>
>>> Regards,
>>>
>>> 	Hans
>>>
>>>>
>>>> Regards,
>>>>
>>>> Andrzej
>>>>
>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> 	Hans
>>>>>
>>>>> On 29/09/2021 18:04, Andrzej Pietrasiewicz wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> This patch series adds VP9 codec V4L2 control interface and two drivers
>>>>>> using the new controls. It is a follow-up of previous v6 series [1].
>>>>>>
>>>>>> In this iteration, we've implemented VP9 hardware decoding on two devices:
>>>>>> Rockchip VDEC and Hantro G2, and tested on RK3399, i.MX8MQ and i.MX8MP.
>>>>>> The i.MX8M driver needs proper power domains support, though, which is a
>>>>>> subject of a different effort, but in all 3 cases we were able to run the
>>>>>> drivers.
>>>>>>
>>>>>> GStreamer support is also available, the needed changes have been submitted
>>>>>> by Daniel Almeida [2]. This MR is ready to be merged, and just needs the
>>>>>> VP9 V4L2 controls to be merged and released.
>>>>>>
>>>>>> Both rkvdec and hantro drivers are passing a significant number of VP9 tests
>>>>>> using Fluster[3]. There are still a few tests that are not passing, due to
>>>>>> dynamic frame resize (not yet supported by V4L2) and small size videos
>>>>>> (due to IP block limitations).
>>>>>>
>>>>>> The series adds the VP9 codec V4L2 control API as uAPI, so it aims at being
>>>>>> merged without passing through staging, as agreed[4]. The ABI has been checked
>>>>>> for padding and verified to contain no holes.
>>>>>>
>>>>>> [1] https://patchwork.linuxtv.org/project/linux-media/list/?series=6377
>>>>>> [2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2144
>>>>>> [3] https://github.com/fluendo/fluster
>>>>>> [4] https://lore.kernel.org/linux-media/b8f83c93-67fd-09f5-9314-15746cbfdc61@xs4all.nl/
>>>>>>
>>>>>> The series depends on the YUV tiled format support prepared by Ezequiel:
>>>>>> https://www.spinics.net/lists/linux-media/msg197047.html
>>>>>>
>>>>>> Rebased onto latest media_tree.
>>>>>>
>>>>>> Changes related to v6:
>>>>>> - moved setting tile filter and tile bsd auxiliary buffer addresses so
>>>>>> that they are always set, even if no tiles are used (thanks, Jernej)
>>>>>> - added a comment near the place where the 32-bit DMA mask is applied
>>>>>>      (thanks, Nicolas)
>>>>>> - improved consistency in register names (thanks, Nicolas)
>>>>>>
>>>>>> Changes related to v5:
>>>>>> - improved the doc comments as per Ezequiel's review (thanks, Ezequiel)
>>>>>> - improved pdf output of documentation
>>>>>> - added Benjamin's Reviewed-by (thanks, Benjamin)
>>>>>>
>>>>>> Changes related to v4:
>>>>>> - removed unused enum v4l2_vp9_intra_prediction_mode
>>>>>> - converted remaining enums to defines to follow the convention
>>>>>> - improved the documentation, in particular better documented how to use segmentation
>>>>>> features
>>>>>>
>>>>>> Changes related to v3:
>>>>>>
>>>>>> Apply suggestions from Jernej's review (thanks, Jernej):
>>>>>> - renamed a control and two structs:
>>>>>> 	V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR_PROBS =>
>>>>>> 		V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR
>>>>>> 	v4l2_ctrl_vp9_compressed_hdr_probs =>
>>>>>> 		v4l2_ctrl_vp9_compressed_hdr
>>>>>> 	v4l2_vp9_mv_compressed_hdr_probs => v4l2_vp9_mv_probs
>>>>>> - moved tx_mode to v4l2_ctrl_vp9_compressed_hdr
>>>>>> - fixed enum v4l2_vp9_ref_frame_sign_bias values (which are used to test a bitfield)
>>>>>> - explicitly assigned values to all other vp9 enums
>>>>>>
>>>>>> Apply suggestion from Nicolas's review (thanks, Nicolas):
>>>>>> - explicitly stated that the v4l2_ctrl_vp9_compressed_hdr control is optional
>>>>>> and implemented only by drivers which need it
>>>>>>
>>>>>> Changes related to the RFC v2:
>>>>>>
>>>>>> - added another driver including a postprocessor to de-tile
>>>>>>            codec-specific tiling
>>>>>> - reworked uAPI structs layout to follow VP8 style
>>>>>> - changed validation of loop filter params
>>>>>> - changed validation of segmentation params
>>>>>> - changed validation of VP9 frame params
>>>>>> - removed level lookup array from loop filter struct
>>>>>>            (can be computed by drivers)
>>>>>> - renamed some enum values to match the spec more closely
>>>>>> - V4L2 VP9 library changed the 'eob' member of
>>>>>>            'struct v4l2_vp9_frame_symbol_counts' so that it is an array
>>>>>>            of pointers instead of an array of pointers to arrays
>>>>>>            (IPs such as g2 creatively pass parts of the 'eob' counts in
>>>>>>            the 'coeff' counts)
>>>>>> - factored out several repeated portions of code
>>>>>> - minor nitpicks and cleanups
>>>>>>
>>>>>> Andrzej Pietrasiewicz (6):
>>>>>>      media: uapi: Add VP9 stateless decoder controls
>>>>>>      media: Add VP9 v4l2 library
>>>>>>      media: hantro: Rename registers
>>>>>>      media: hantro: Prepare for other G2 codecs
>>>>>>      media: hantro: Support VP9 on the G2 core
>>>>>>      media: hantro: Support NV12 on the G2 core
>>>>>>
>>>>>> Boris Brezillon (1):
>>>>>>      media: rkvdec: Add the VP9 backend
>>>>>>
>>>>>> Ezequiel Garcia (4):
>>>>>>      hantro: postproc: Fix motion vector space size
>>>>>>      hantro: postproc: Introduce struct hantro_postproc_ops
>>>>>>      hantro: Simplify postprocessor
>>>>>>      hantro: Add quirk for NV12/NV12_4L4 capture format
>>>>>>
>>>>>>     .../userspace-api/media/v4l/biblio.rst        |   10 +
>>>>>>     .../media/v4l/ext-ctrls-codec-stateless.rst   |  573 +++++
>>>>>>     .../media/v4l/pixfmt-compressed.rst           |   15 +
>>>>>>     .../media/v4l/vidioc-g-ext-ctrls.rst          |    8 +
>>>>>>     .../media/v4l/vidioc-queryctrl.rst            |   12 +
>>>>>>     .../media/videodev2.h.rst.exceptions          |    2 +
>>>>>>     drivers/media/v4l2-core/Kconfig               |    4 +
>>>>>>     drivers/media/v4l2-core/Makefile              |    1 +
>>>>>>     drivers/media/v4l2-core/v4l2-ctrls-core.c     |  180 ++
>>>>>>     drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    8 +
>>>>>>     drivers/media/v4l2-core/v4l2-ioctl.c          |    1 +
>>>>>>     drivers/media/v4l2-core/v4l2-vp9.c            | 1850 +++++++++++++++++
>>>>>>     drivers/staging/media/hantro/Kconfig          |    1 +
>>>>>>     drivers/staging/media/hantro/Makefile         |    7 +-
>>>>>>     drivers/staging/media/hantro/hantro.h         |   40 +-
>>>>>>     drivers/staging/media/hantro/hantro_drv.c     |   23 +-
>>>>>>     drivers/staging/media/hantro/hantro_g2.c      |   27 +
>>>>>>     .../staging/media/hantro/hantro_g2_hevc_dec.c |   69 +-
>>>>>>     drivers/staging/media/hantro/hantro_g2_regs.h |  132 +-
>>>>>>     .../staging/media/hantro/hantro_g2_vp9_dec.c  |  980 +++++++++
>>>>>>     drivers/staging/media/hantro/hantro_hw.h      |   83 +-
>>>>>>     .../staging/media/hantro/hantro_postproc.c    |   79 +-
>>>>>>     drivers/staging/media/hantro/hantro_v4l2.c    |   20 +
>>>>>>     drivers/staging/media/hantro/hantro_vp9.c     |  240 +++
>>>>>>     drivers/staging/media/hantro/hantro_vp9.h     |  103 +
>>>>>>     drivers/staging/media/hantro/imx8m_vpu_hw.c   |   38 +-
>>>>>>     .../staging/media/hantro/rockchip_vpu_hw.c    |    7 +-
>>>>>>     .../staging/media/hantro/sama5d4_vdec_hw.c    |    3 +-
>>>>>>     drivers/staging/media/rkvdec/Kconfig          |    1 +
>>>>>>     drivers/staging/media/rkvdec/Makefile         |    2 +-
>>>>>>     drivers/staging/media/rkvdec/rkvdec-vp9.c     | 1078 ++++++++++
>>>>>>     drivers/staging/media/rkvdec/rkvdec.c         |   52 +-
>>>>>>     drivers/staging/media/rkvdec/rkvdec.h         |   12 +-
>>>>>>     include/media/v4l2-ctrls.h                    |    4 +
>>>>>>     include/media/v4l2-vp9.h                      |  182 ++
>>>>>>     include/uapi/linux/v4l2-controls.h            |  284 +++
>>>>>>     include/uapi/linux/videodev2.h                |    6 +
>>>>>>     37 files changed, 6033 insertions(+), 104 deletions(-)
>>>>>>     create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c
>>>>>>     create mode 100644 drivers/staging/media/hantro/hantro_g2.c
>>>>>>     create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c
>>>>>>     create mode 100644 drivers/staging/media/hantro/hantro_vp9.c
>>>>>>     create mode 100644 drivers/staging/media/hantro/hantro_vp9.h
>>>>>>     create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
>>>>>>     create mode 100644 include/media/v4l2-vp9.h
>>>>>>
>>>>>>
>>>>>> base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
>>>>>>
>>>>>
>>>>
>>>
>>
> 



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-16 13:14           ` Andrzej Pietrasiewicz
@ 2021-11-17  9:59             ` Hans Verkuil
  2021-11-17 10:49               ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Hans Verkuil @ 2021-11-17  9:59 UTC (permalink / raw)
  To: Andrzej Pietrasiewicz, linux-media, linux-arm-kernel,
	linux-kernel, linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

On 16/11/2021 14:14, Andrzej Pietrasiewicz wrote:
> Hi,
> 
> On 16.11.2021 at 09:21, Hans Verkuil wrote:
>> On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
>>> Hi Hans,
>>>
>>> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>>>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>>>> Hi Hans,
>>>>>
>>>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>>>> Andrzej,
>>>>>>
>>>>>> Can you rebase this series on top of the master branch of
>>>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>>>>> manner.
>>>>>
>>>>> This is a branch for you:
>>>>>
>>>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>>>
>>>> I'm getting a bunch of sparse/smatch warnings:
>>>>
>>>
>>> Thanks for finding this, I will re-create the branch and let you know on IRC.
>>> Some of the below are false positives, namely:
>>>
>>> drivers/media/platform/omap3isp/omap3isp.h
>>> drivers/media/platform/qcom/venus/core.h
>>
>> Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.
>>
>> Please post a v8. That way the series is archived on lore. And it works better
>> with patchwork.
> 
> Sure, no problem. Also please see below.
> 
>>
>> Regards,
>>
>> 	Hans
>>
>>>
>>> which are not touched by the series.
>>>
>>> Regards,
>>>
>>> Andrzej
>>>
>>>> sparse:
>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>>>
>>>> smatch:
>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91
> 
> This looks like a false positive.
> 
> A portion of memory pointed to by ptr is indexed with i * 23 + m,
> where i ranges from 0 to 3, inclusive, and m ranges from 0 to 22,
> inclusive if i < 3, otherwise m ranges from 0 to 20, inclusive.
> So the largest index value we compute equals 89 (3 * 23 + 20).
> Because ptr points to something that is at least 90 bytes large,
> 89 is a valid index and no greater index will ever be computed.

But we do need to get rid of this smatch warning, otherwise it will pollute the
list of smatch warnings.

I was looking at the code and wonder if it wouldn't make more sense to
move writing to rkprobs->intra_mode[i].uv_mode[] into a separate for loop:

        for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++)
                rkprobs->intra_mode[i / 23].uv_mode[i % 23] = v4l2_vp9_kf_uv_mode_prob[i];

Wouldn't that do the same as the current code? It looks simpler as well.
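
As a standalone illustration of that question (a userspace sketch, not the
driver code), the block below copies a dummy 90-byte source both ways: with the
nested i/m loops described in the quoted explanation and with a single flat
loop using i / 23 and i % 23. The struct demo_intra_mode type, the DST_ROWS
count, the dummy data and all function names are hypothetical stand-ins; only
the 23-byte row width, the 0..3 row range and the 90-byte source size come
from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRC_BYTES 90	/* size of the source, as stated in the thread */
#define DST_ROWS  4	/* assumption: destination rows written (i = 0..3) */
#define DST_COLS  23	/* bytes per destination row, from "/ 23" and "% 23" */

/* Hypothetical stand-in for rkprobs->intra_mode[]; only uv_mode is modelled. */
struct demo_intra_mode {
	uint8_t uv_mode[DST_COLS];
};

/* Nested form: i in 0..3, m in 0..22 (0..20 when i == 3). */
size_t copy_nested(struct demo_intra_mode *dst, const uint8_t *src)
{
	size_t i, m, max_index = 0;

	for (i = 0; i < DST_ROWS; i++) {
		for (m = 0; m < (i < 3 ? 23u : 21u); m++) {
			dst[i].uv_mode[m] = src[i * 23 + m];
			max_index = i * 23 + m;
		}
	}
	return max_index;	/* largest source index that was read */
}

/* Flat form: one loop over all n source bytes. */
void copy_flat(struct demo_intra_mode *dst, const uint8_t *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i / DST_COLS].uv_mode[i % DST_COLS] = src[i];
}

/* Returns 1 if both forms fill the destination identically, 0 otherwise. */
int forms_agree(void)
{
	uint8_t src[SRC_BYTES];
	struct demo_intra_mode a[DST_ROWS], b[DST_ROWS];
	size_t i, max_index;

	for (i = 0; i < SRC_BYTES; i++)
		src[i] = (uint8_t)(i + 1);	/* dummy data */

	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));

	max_index = copy_nested(a, src);
	assert(max_index == 89);	/* 3 * 23 + 20 */
	copy_flat(b, src, SRC_BYTES);

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling forms_agree() from a small test program is enough to check that the
two loops touch the same 90 bytes under these assumptions.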

Regards,

	Hans


* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-17  9:59             ` Hans Verkuil
@ 2021-11-17 10:49               ` Andrzej Pietrasiewicz
  2021-11-17 10:51                 ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-17 10:49 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi,

On 17.11.2021 at 10:59, Hans Verkuil wrote:
> On 16/11/2021 14:14, Andrzej Pietrasiewicz wrote:
>> Hi,
>>
>> On 16.11.2021 at 09:21, Hans Verkuil wrote:
>>> On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
>>>> Hi Hans,
>>>>
>>>> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>>>>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>>>>> Hi Hans,
>>>>>>
>>>>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>>>>> Andrzej,
>>>>>>>
>>>>>>> Can you rebase this series on top of the master branch of
>>>>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>>>>>> manner.
>>>>>>
>>>>>> This is a branch for you:
>>>>>>
>>>>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>>>>
>>>>> I'm getting a bunch of sparse/smatch warnings:
>>>>>
>>>>
>>>> Thanks for finding this, I will re-create the branch and let you know on IRC.
>>>> Some of the below are false positives, namely:
>>>>
>>>> drivers/media/platform/omap3isp/omap3isp.h
>>>> drivers/media/platform/qcom/venus/core.h
>>>
>>> Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.
>>>
>>> Please post a v8. That way the series is archived on lore. And it works better
>>> with patchwork.
>>
>> Sure, no problem. Also please see below.
>>
>>>
>>> Regards,
>>>
>>> 	Hans
>>>
>>>>
>>>> which are not touched by the series.
>>>>
>>>> Regards,
>>>>
>>>> Andrzej
>>>>
>>>>> sparse:
>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>>>>
>>>>> smatch:
>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not used [-Wunused-but-set-variable]
>>>>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() error: buffer overflow 'ptr' 90 <= 91
>>
>> This looks like a false positive.
>>
>> A portion of memory pointed to by ptr is indexed with i * 23 + m,
>> where i ranges from 0 to 3, inclusive, and m ranges from 0 to 22,
>> inclusive if i < 3, otherwise m ranges from 0 to 20, inclusive.
>> So the largest index value we compute equals 89 (3 * 23 + 20).
>> Because ptr points to something that is at least 90 bytes large,
>> 89 is a valid index and no greater index will ever be computed.
> 
> But we do need to get rid of this smatch warning, otherwise it will pollute the
> list of smatch warnings.
> 
> I was looking at the code and wonder if it wouldn't make more sense to
> move writing to rkprobs->intra_mode[i].uv_mode[] into a separate for loop:
> 
>          for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++)
>                  rkprobs->intra_mode[i / 23].uv_mode[i % 23] = v4l2_vp9_kf_uv_mode_prob[i];
> 
> Wouldn't that do the same as the current code? It looks simpler as well.
> 

I think it would, but I would slightly change the loop:

	for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++) {
		const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;

		rkprobs->intra_mode[i / 23].uv_mode[i % 23] = ptr[i];
	}

because v4l2_vp9_kf_uv_mode_prob is actually a u8[10][9].

I will make such a change locally and test whether it causes regressions.

Once I confirm it works (and I expect I will), would you like me to post a v9,
reply only to the changed patch with its updated version, or would you prefer
to make this change yourself?

Andrzej


* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-17 10:49               ` Andrzej Pietrasiewicz
@ 2021-11-17 10:51                 ` Andrzej Pietrasiewicz
  2021-11-17 11:33                   ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-17 10:51 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi again,

On 17.11.2021 at 11:49, Andrzej Pietrasiewicz wrote:
> Hi,
> 
> On 17.11.2021 at 10:59, Hans Verkuil wrote:
>> On 16/11/2021 14:14, Andrzej Pietrasiewicz wrote:
>>> Hi,
>>>
>>> On 16.11.2021 at 09:21, Hans Verkuil wrote:
>>>> On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
>>>>> Hi Hans,
>>>>>
>>>>> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>>>>>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>>>>>> Hi Hans,
>>>>>>>
>>>>>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>>>>>> Andrzej,
>>>>>>>>
>>>>>>>> Can you rebase this series on top of the master branch of
>>>>>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>>>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a non-trivial
>>>>>>>> manner.
>>>>>>>
>>>>>>> This is a branch for you:
>>>>>>>
>>>>>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>>>>>
>>>>>> I'm getting a bunch of sparse/smatch warnings:
>>>>>>
>>>>>
>>>>> Thanks for finding this, I will re-create the branch and let you know on IRC.
>>>>> Some of the below are false positives, namely:
>>>>>
>>>>> drivers/media/platform/omap3isp/omap3isp.h
>>>>> drivers/media/platform/qcom/venus/core.h
>>>>
>>>> Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.
>>>>
>>>> Please post a v8. That way the series is archived on lore. And it works better
>>>> with patchwork.
>>>
>>> Sure, no problem. Also please see below.
>>>
>>>>
>>>> Regards,
>>>>
>>>>     Hans
>>>>
>>>>>
>>>>> which are not touched by the series.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andrzej
>>>>>
>>>>>> sparse:
>>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not 
>>>>>> used [-Wunused-but-set-variable]
>>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not 
>>>>>> used [-Wunused-but-set-variable]
>>>>>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: 
>>>>>> symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>>>>>
>>>>>> smatch:
>>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not 
>>>>>> used [-Wunused-but-set-variable]
>>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not 
>>>>>> used [-Wunused-but-set-variable]
>>>>>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() 
>>>>>> error: buffer overflow 'ptr' 90 <= 91
>>>
>>> This looks like a false positive.
>>>
>>> A portion of memory pointed to by ptr is indexed with i * 23 + m,
>>> where i ranges from 0 to 3, inclusive, and m ranges from 0 to 22,
>>> inclusive if i < 3, otherwise m ranges from 0 to 20, inclusive.
>>> So the largest index value we compute equals 89 (3 * 23 + 20).
>>> Because ptr points to something that is at least 90 bytes large,
>>> 89 is a valid index and no greater index will ever be computed.
>>
>> But we do need to get rid of this smatch warning, otherwise it will pollute the
>> list of smatch warnings.
>>
>> I was looking at the code and wonder if it wouldn't make more sense to
>> move writing to rkprobs->intra_mode[i].uv_mode[] into a separate for loop:
>>
>>          for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++)
>>                  rkprobs->intra_mode[i / 23].uv_mode[i % 23] = 
>> v4l2_vp9_kf_uv_mode_prob[i];
>>
>> Wouldn't that do the same as the current code? It looks simpler as well.
>>
> 
> I think it would, but I would slightly change the loop:
>
>      for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++) {

actually, sizeof(v4l2_vp9_kf_uv_mode_prob)



>          const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;
> 
>          rkprobs->intra_mode[i / 23].uv_mode[i % 23] = ptr[i];
>      }
> 
> because v4l2_vp9_kf_uv_mode_prob is actually a u8[10][9].
> 
> I will make such a change locally and test whether it causes regressions.
> 
> Once I confirm it works (and I expect I will), would you like me to post a v9,
> reply only to the changed patch with its updated version, or would you prefer
> to make this change yourself?
> 
> Andrzej



* Re: [PATCH v7 00/11] VP9 codec V4L2 control interface
  2021-11-17 10:51                 ` Andrzej Pietrasiewicz
@ 2021-11-17 11:33                   ` Andrzej Pietrasiewicz
  0 siblings, 0 replies; 37+ messages in thread
From: Andrzej Pietrasiewicz @ 2021-11-17 11:33 UTC (permalink / raw)
  To: Hans Verkuil, linux-media, linux-arm-kernel, linux-kernel,
	linux-rockchip, linux-staging
  Cc: Benjamin Gaignard, Boris Brezillon, Ezequiel Garcia,
	Fabio Estevam, Greg Kroah-Hartman, Heiko Stuebner,
	Jernej Skrabec, Mauro Carvalho Chehab, Nicolas Dufresne,
	NXP Linux Team, Pengutronix Kernel Team, Philipp Zabel,
	Sascha Hauer, Shawn Guo, kernel

Hi Hans,

On 17.11.2021 at 11:51, Andrzej Pietrasiewicz wrote:
> Hi again,
> 
> On 17.11.2021 at 11:49, Andrzej Pietrasiewicz wrote:
>> Hi,
>>
>> On 17.11.2021 at 10:59, Hans Verkuil wrote:
>>> On 16/11/2021 14:14, Andrzej Pietrasiewicz wrote:
>>>> Hi,
>>>>
>>>> On 16.11.2021 at 09:21, Hans Verkuil wrote:
>>>>> On 16/11/2021 09:09, Andrzej Pietrasiewicz wrote:
>>>>>> Hi Hans,
>>>>>>
>>>>>> On 15.11.2021 at 22:16, Hans Verkuil wrote:
>>>>>>> On 15/11/2021 18:14, Andrzej Pietrasiewicz wrote:
>>>>>>>> Hi Hans,
>>>>>>>>
>>>>>>>> On 15.11.2021 at 16:07, Hans Verkuil wrote:
>>>>>>>>> Andrzej,
>>>>>>>>>
>>>>>>>>> Can you rebase this series on top of the master branch of
>>>>>>>>> https://git.linuxtv.org/media_stage.git/ ? Unfortunately this v7 no longer
>>>>>>>>> applies. Specifically "rkvdec: Add the VP9 backend" failed in a 
>>>>>>>>> non-trivial
>>>>>>>>> manner.
>>>>>>>>
>>>>>>>> This is a branch for you:
>>>>>>>>
>>>>>>>> https://gitlab.collabora.com/linux/for-upstream/-/tree/vp9-uapi
>>>>>>>
>>>>>>> I'm getting a bunch of sparse/smatch warnings:
>>>>>>>
>>>>>>
>>>>>> Thanks for finding this, I will re-create the branch and let you know on IRC.
>>>>>> Some of the below are false positives, namely:
>>>>>>
>>>>>> drivers/media/platform/omap3isp/omap3isp.h
>>>>>> drivers/media/platform/qcom/venus/core.h
>>>>>
>>>>> Ah, sorry, I thought I had filtered those out. Obviously you can ignore those.
>>>>>
>>>>> Please post a v8. That way the series is archived on lore. And it works better
>>>>> with patchwork.
>>>>
>>>> Sure, no problem. Also please see below.
>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>>     Hans
>>>>>
>>>>>>
>>>>>> which are not touched by the series.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Andrzej
>>>>>>
>>>>>>> sparse:
>>>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not 
>>>>>>> used [-Wunused-but-set-variable]
>>>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not 
>>>>>>> used [-Wunused-but-set-variable]
>>>>>>> SPARSE:hantro/hantro_postproc.c hantro/hantro_postproc.c:37:35: warning: 
>>>>>>> symbol 'hantro_g1_postproc_regs' was not declared. Should it be static?
>>>>>>>
>>>>>>> smatch:
>>>>>>> rkvdec/rkvdec-vp9.c:190:43: warning: variable 'dec_params' set but not 
>>>>>>> used [-Wunused-but-set-variable]
>>>>>>> rkvdec/rkvdec-vp9.c:245:43: warning: variable 'dec_params' set but not 
>>>>>>> used [-Wunused-but-set-variable]
>>>>>>> rkvdec/rkvdec-vp9.c: rkvdec/rkvdec-vp9.c:236 init_intra_only_probs() 
>>>>>>> error: buffer overflow 'ptr' 90 <= 91
>>>>
>>>> This looks like a false positive.
>>>>
>>>> A portion of memory pointed to by ptr is indexed with i * 23 + m,
>>>> where i ranges from 0 to 3, inclusive, and m ranges from 0 to 22,
>>>> inclusive if i < 3, otherwise m ranges from 0 to 20, inclusive.
>>>> So the largest index value we compute equals 89 (3 * 23 + 20).
>>>> Because ptr points to something that is at least 90 bytes large,
>>>> 89 is a valid index and no greater index will ever be computed.
>>>
>>> But we do need to get rid of this smatch warning, otherwise it will pollute the
>>> list of smatch warnings.
>>>
>>> I was looking at the code and wonder if it wouldn't make more sense to
>>> move writing to rkprobs->intra_mode[i].uv_mode[] into a separate for loop:
>>>
>>>          for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++)
>>>                  rkprobs->intra_mode[i / 23].uv_mode[i % 23] = 
>>> v4l2_vp9_kf_uv_mode_prob[i];
>>>
>>> Wouldn't that do the same as the current code? It looks simpler as well.
>>>
>>
>> I think it would, but I would slightly change the loop:
>>
>>      for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_uv_mode_prob); i++) {
> 
> actually, sizeof(v4l2_vp9_kf_uv_mode_prob)
> 
> 
> 
>>          const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;
>>
>>          rkprobs->intra_mode[i / 23].uv_mode[i % 23] = ptr[i];
>>      }
>>
>> because v4l2_vp9_kf_uv_mode_prob is actually a u8[10][9].
>>
>> I will make such a change locally and test whether it causes regressions.

This worked, no regressions:

	for (i = 0; i < sizeof(v4l2_vp9_kf_uv_mode_prob); ++i) {
		const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob;

		rkprobs->intra_mode[i / 23].uv_mode[i % 23] = ptr[i];
	}
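
For readers wondering why the loop bound is sizeof() rather than ARRAY_SIZE():
on a two-dimensional array, an ARRAY_SIZE-style macro counts only the outer
dimension, while sizeof() gives the total number of bytes. A minimal userspace
sketch of that difference, where DEMO_ARRAY_SIZE, demo_uv_mode_prob and the two
helper functions are hypothetical stand-ins (the kernel macro additionally
performs a type check that is not modelled here):

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace approximation of the kernel's ARRAY_SIZE(), without its type check. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Dummy zero-filled table with the u8[10][9] shape mentioned in the thread. */
const uint8_t demo_uv_mode_prob[10][9] = { { 0 } };

/* Number of rows: what an ARRAY_SIZE-style macro reports for a 2D array. */
size_t demo_outer_count(void)
{
	return DEMO_ARRAY_SIZE(demo_uv_mode_prob);
}

/* Total number of bytes: the bound a flat byte-wise copy needs. */
size_t demo_byte_count(void)
{
	return sizeof(demo_uv_mode_prob);
}
```

With this shape, demo_outer_count() yields 10 and demo_byte_count() yields 90,
so a flat loop bounded by the former would copy only the first 10 bytes.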

Andrzej


end of thread, other threads:[~2021-11-17 11:33 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-29 16:04 [PATCH v7 00/11] VP9 codec V4L2 control interface Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 01/11] hantro: postproc: Fix motion vector space size Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 02/11] hantro: postproc: Introduce struct hantro_postproc_ops Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 03/11] hantro: Simplify postprocessor Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 04/11] hantro: Add quirk for NV12/NV12_4L4 capture format Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 05/11] media: uapi: Add VP9 stateless decoder controls Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 06/11] media: Add VP9 v4l2 library Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 07/11] media: rkvdec: Add the VP9 backend Andrzej Pietrasiewicz
2021-10-08 10:30   ` Chen-Yu Tsai
2021-10-19 23:24   ` Alex Bee
2021-10-20 13:07     ` Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 08/11] media: hantro: Rename registers Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 09/11] media: hantro: Prepare for other G2 codecs Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 10/11] media: hantro: Support VP9 on the G2 core Andrzej Pietrasiewicz
2021-09-29 16:04 ` [PATCH v7 11/11] media: hantro: Support NV12 " Andrzej Pietrasiewicz
2021-10-14 17:42   ` Jernej Škrabec
2021-10-15 17:19     ` Andrzej Pietrasiewicz
2021-10-19 16:38       ` Jernej Škrabec
2021-10-20 11:06         ` Ezequiel Garcia
2021-10-20 15:04           ` Jernej Škrabec
2021-10-20 15:25             ` Ezequiel Garcia
2021-10-21 15:36               ` Jernej Škrabec
2021-10-19 17:55 ` [PATCH v7 00/11] VP9 codec V4L2 control interface Ezequiel Garcia
2021-11-11 14:44 ` Hans Verkuil
2021-11-12 15:27   ` Nicolas Dufresne
2021-11-15 12:56     ` Andrzej Pietrasiewicz
2021-11-15 13:09       ` Andrzej Pietrasiewicz
2021-11-15 15:07 ` Hans Verkuil
2021-11-15 17:14   ` Andrzej Pietrasiewicz
2021-11-15 21:16     ` Hans Verkuil
2021-11-16  8:09       ` Andrzej Pietrasiewicz
2021-11-16  8:21         ` Hans Verkuil
2021-11-16 13:14           ` Andrzej Pietrasiewicz
2021-11-17  9:59             ` Hans Verkuil
2021-11-17 10:49               ` Andrzej Pietrasiewicz
2021-11-17 10:51                 ` Andrzej Pietrasiewicz
2021-11-17 11:33                   ` Andrzej Pietrasiewicz
