
CineTransfer: Controlling a Robot to Imitate Cinematographic Style from a Single Example (2310.03953v1)

Published 6 Oct 2023 in cs.RO

Abstract: This work presents CineTransfer, an algorithmic framework that drives a robot to record a video sequence that mimics the cinematographic style of an input video. We propose features that abstract the aesthetic style of the input video, so the robot can transfer this style to a scene whose visual details differ significantly from those of the input video. The framework builds upon CineMPC, a tool that allows users to control cinematographic features, such as the subjects' position in the image and the depth of field, by manipulating the intrinsics and extrinsics of a cinematographic camera. However, CineMPC requires a human expert to specify the desired style of the shot (composition, camera motion, zoom, focus, etc.). CineTransfer bridges this gap, aiming at a fully autonomous cinematographic platform. The user chooses a single input video as a style guide. CineTransfer extracts and optimizes two important style features, the composition of the subject in the image and the scene depth of field, and provides instructions for CineMPC to control the robot to record an output sequence that matches these features as closely as possible. In contrast with other style transfer methods, our approach is a lightweight, portable framework that does not require deep network training or extensive datasets. Experiments with real and simulated videos demonstrate the system's ability to analyze and transfer style between recordings; results are available in the supplementary video.
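The paper does not publish an implementation, but the pipeline the abstract describes (extract per-frame style features from the reference video, then feed them as tracking targets to CineMPC's optimizer) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names `extract_composition` and `style_cost` are hypothetical, the subject bounding box is presumed to come from an off-the-shelf detector or pose estimator (the references cite Mask R-CNN and OpenPose for this role), and the quadratic tracking cost is only one plausible form the MPC objective could take.

```python
# Illustrative sketch only; not the authors' code. Names and the cost form
# are assumptions. Assumes a per-frame subject bounding box from an
# off-the-shelf detector (e.g., Mask R-CNN or OpenPose, both cited).

import numpy as np

def extract_composition(bbox, frame_shape):
    """Normalized subject placement: bounding-box center and relative size.

    Normalizing by frame dimensions makes the feature independent of the
    scene's visual content, so it can be matched in a different scene.
    """
    h, w = frame_shape[:2]
    x0, y0, x1, y1 = bbox
    cx = (x0 + x1) / 2.0 / w                 # horizontal position in [0, 1]
    cy = (y0 + y1) / 2.0 / h                 # vertical position in [0, 1]
    size = (x1 - x0) * (y1 - y0) / (w * h)   # fraction of the frame occupied
    return np.array([cx, cy, size])

def style_cost(features, target_features, weights):
    """Quadratic tracking cost an MPC could minimize so that the recorded
    sequence matches the reference video's per-frame style features."""
    err = features - target_features
    return float(err @ np.diag(weights) @ err)
```

Normalizing composition by frame size is what would let the same target values be pursued in a scene with entirely different visual details, which is the portability the abstract emphasizes; a depth-of-field feature (e.g., a defocus-blur estimate, per reference 18) would be appended to the feature vector in the same way.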

References (20)
  1. R. Bonatti, W. Wang, C. Ho, A. Ahuja, M. Gschwindt, E. Camci, E. Kayacan, S. Choudhury, and S. Scherer, “Autonomous aerial cinematography in unstructured environments with learned artistic decision-making,” Journal of Field Robotics, vol. 37, no. 4, pp. 606–641, 2020.
  2. A. Alcántara, J. Capitán, A. Torres-González, R. Cunha, and A. Ollero, “Autonomous execution of cinematographic shots with multiple drones,” IEEE Access, vol. 8, pp. 201300–201316, 2020.
  3. P. Pueyo, E. Montijano, A. C. Murillo, and M. Schwager, “CineMPC: Controlling camera intrinsics and extrinsics for autonomous cinematography,” in IEEE ICRA, 2022, pp. 4058–4064.
  4. F. Janabi-Sharifi, L. Deng, and W. J. Wilson, “Comparison of basic visual servoing methods,” IEEE/ASME Trans. on Mechatronics, vol. 16, no. 5, pp. 967–983, 2010.
  5. Z. Chen and S. T. Birchfield, “Qualitative vision-based path following,” IEEE Trans. on Robotics, vol. 25, no. 3, pp. 749–754, 2009.
  6. Y. Matsumoto, M. Inaba, and H. Inoue, “Visual navigation using view-sequenced route representation,” in IEEE ICRA, 1996, pp. 83–88.
  7. S. Paradis, M. Hwang, B. Thananjeyan, J. Ichnowski, D. Seita, D. Fer, T. Low, J. E. Gonzalez, and K. Goldberg, “Intermittent visual servoing: Efficiently learning policies robust to instrument changes for high-precision surgical manipulation,” in IEEE ICRA, 2021, pp. 7166–7173.
  8. Y. Dang, C. Huang, P. Chen, R. Liang, X. Yang, and K.-T. Cheng, “Imitation learning-based algorithm for drone cinematography system,” IEEE Trans. on Cognitive and Dev. Systems, pp. 403–413, 2020.
  9. Y. Dang, C. Huang, P. Chen, R. Liang, X. Yang, and K. Cheng, “Path-analysis-based reinforcement learning algorithm for imitation filming,” IEEE Trans. on Multimedia, 2023.
  10. H. Jiang, M. Christie, X. Wang, L. Liu, B. Wang, and B. Chen, “Camera keyframing with style and control,” ACM Trans. on Graphics, vol. 40, no. 6, pp. 1–13, 2021.
  11. C. Huang, C.-E. Lin, Z. Yang, Y. Kong, P. Chen, X. Yang, and K.-T. Cheng, “Learning to film from professional human motion videos,” in IEEE/CVF CVPR, 2019, pp. 4244–4253.
  12. C. Huang, Z. Yang, Y. Kong, P. Chen, X. Yang, and K.-T. T. Cheng, “Learning to capture a film-look video with a camera drone,” in IEEE ICRA, 2019, pp. 1871–1877.
  13. C. Huang, Y. Dang, P. Chen, X. Yang, and K.-T. Cheng, “One-shot imitation drone filming of human motion videos,” IEEE TPAMI, vol. 44, no. 9, pp. 5335–5348, 2021.
  14. W. Abdulla, “Mask R-CNN for object detection and segmentation on Keras and TensorFlow,” https://github.com/matterport/Mask_RCNN.
  15. Z. Cao, G. Hidalgo Martinez, T. Simon, S. Wei, and Y. A. Sheikh, “OpenPose: Realtime multi-person 2D pose estimation using part affinity fields,” CoRR, 2019. [Online]. Available: http://arxiv.org/abs/1812.08008
  16. J. Li, C. Wang, H. Zhu, Y. Mao, H.-S. Fang, and C. Lu, “CrowdPose: Efficient crowded scenes pose estimation and a new benchmark,” in IEEE/CVF CVPR, 2019, pp. 10863–10872.
  17. S. Wang, D. Yang, B. Wang, Z. Guo, R. Verma, J. Ramesh, C. Weinrich, U. Kreßel, and F. B. Flohr, “UrbanPose: A new benchmark for VRU pose estimation in urban traffic scenes,” in IEEE IV, 2021, pp. 1537–1544.
  18. W. Zhao, X. Hou, Y. He, and H. Lu, “Defocus blur detection via boosting diversity of deep ensemble networks,” IEEE Trans. on Image Processing, vol. 30, pp. 5426–5438, 2021.
  19. P. Pueyo, E. Cristofalo, E. Montijano, and M. Schwager, “CinemAirSim: A camera-realistic robotics simulator for cinematographic purposes,” in IEEE/RSJ IROS, 2020, pp. 1186–1191.
  20. “GEKKO Optimization Suite,” https://gekko.readthedocs.io/en/latest/#.
