Overview

Advanced Multi-modal Foundation Models (MFMs) and AI Agents, equipped with diverse modalities [15, 26, 19, 25, 29] and an expanding array of functionalities [20, 31] (e.g., tool use, UI assistance, API access, and embodied AI control), have the potential to greatly accelerate and magnify the societal impact of their predecessors [9, 42], for example in autonomous driving [45]. MFMs comprise multi-modal large language models (MLLMs) and multi-modal generative models (MMGMs). MLLMs are language-based models capable of processing, reasoning over, and generating outputs from multiple modalities, including text, images, audio, and video; prominent examples include Reka [48], Qwen-VL [24], and LLaVA [25]. MMGMs are models adept at creating new visual content across modalities, such as synthesizing images from textual descriptions or producing videos from audio and textual inputs; notable instances include Stable Diffusion [26], Sora, and Latte [27]. AI Agents operating in embodied AI robotics [20, 46], autonomous driving [30, 31], and intelligent electronic systems [42] are assuming increasingly pivotal roles.

Consequently, understanding and preemptively addressing the vulnerabilities inherent in these systems [4], the associated risks [6], and the threats posed by synthetic AI-generated content [33, 36] have become of paramount importance. Developing trustworthy MFMs and AI Agents extends beyond bolstering their adversarial robustness; it requires proactive risk assessment, mitigation measures, safeguards, and comprehensive safety protocols across the entire lifecycle of system development and deployment [24, 28]. This approach necessitates a fusion of technical and socio-technical strategies, integrating AI governance (e.g., DeepFake detection [33-40]) and regulatory insights to foster the trustworthiness of MFMs and AI Agents.

Call for Papers

Topics include but are not limited to:

  • Fairness in healthcare
  • AI-generated content detection [33-40]
  • Privacy, copyright, and watermarking in foundation models [14, 17, 32]
  • Fairness, accountability, and regulation [6, 18]
  • Explainable AI and monitoring [16, 47]
  • Safe content generation in diffusion models [28, 29]
  • Robustness, attack and defense, poisoning, hijacking, and security [1, 4, 8, 13, 22, 27]
  • Safety of foundation models in autonomous driving [30, 31]
  • Measures against malicious model fine-tuning [11]
  • Multimodal AI agent design and corresponding safety governance [42-46]

Speakers

An asterisk (*) denotes a provisional arrangement.

Organizing Committee

Steering Committee

References

[1] L. Bailey, E. Ong, S. Russell, and S. Emmons. Image hijacks: Adversarial images can control generative models at runtime, 2023.

[2] M. Bhatt, S. Chennabasappa, C. Nikolaidis, S. Wan, I. Evtimov, D. Gabi, D. Song, F. Ahmad, C. Aschermann, L. Fontana, S. Frolov, R. P. Giri, D. Kapil, Y. Kozyrakis, D. LeBlanc, J. Milazzo, A. Straumann, G. Synnaeve, V. Vontimitta, S. Whitman, and J. Saxe. Purple llama cyberseceval: A secure coding benchmark for language models, 2023.

[3] S. R. Bowman, J. Hyun, E. Perez, E. Chen, C. Pettit, S. Heiner, K. Lukošiūtė, A. Askell, A. Jones, A. Chen, A. Goldie, A. Mirhoseini, C. McKinnon, C. Olah, D. Amodei, D. Amodei, D. Drain, D. Li, E. Tran-Johnson, J. Kernion, J. Kerr, J. Mueller, J. Ladish, J. Landau, K. Ndousse, L. Lovitt, N. Elhage, N. Schiefer, N. Joseph, N. Mercado, N. DasSarma, R. Larson, S. McCandlish, S. Kundu, S. Johnston, S. Kravec, S. E. Showk, S. Fort, T. Telleen-Lawton, T. Brown, T. Henighan, T. Hume, Y. Bai, Z. Hatfield-Dodds, B. Mann, and J. Kaplan. Measuring progress on scalable oversight for large language models, 2022.

[4] N. Carlini, M. Nasr, C. A. Choquette-Choo, M. Jagielski, I. Gao, A. Awadalla, P. W. Koh, D. Ippolito, K. Lee, F. Tramer, and L. Schmidt. Are aligned neural networks adversarially aligned?, 2023.

[5] S. Casper, C. Ezell, C. Siegmann, N. Kolt, T. L. Curtis, B. Bucknall, A. Haupt, K. Wei, J. Scheurer, M. Hobbhahn, L. Sharkey, S. Krishna, M. V. Hagen, S. Alberti, A. Chan, Q. Sun, M. Gerovitch, D. Bau, M. Tegmark, D. Krueger, and D. Hadfield-Menell. Black-box access is insufficient for rigorous ai audits, 2024.

[6] A. Chan, R. Salganik, A. Markelius, C. Pang, N. Rajkumar, D. Krasheninnikov, L. Langosco, Z. He, Y. Duan, M. Carroll, M. Lin, A. Mayhew, K. Collins, M. Molamohammadi, J. Burden, W. Zhao, S. Rismani, K. Voudouris, U. Bhatt, A. Weller, D. Krueger, and T. Maharaj. Harms from increasingly agentic algorithmic systems. In 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’23. ACM, June 2023. doi: 10.1145/3593013.3594033. URL http://dx.doi.org/10.1145/3593013.3594033.

[7] Y. Chang, X. Wang, J. Wang, Y. Wu, L. Yang, K. Zhu, H. Chen, X. Yi, C. Wang, Y. Wang, W. Ye, Y. Zhang, Y. Chang, P. S. Yu, Q. Yang, and X. Xie. A survey on evaluation of large language models, 2023.

[8] Y. Dong, H. Chen, J. Chen, Z. Fang, X. Yang, Y. Zhang, Y. Tian, H. Su, and J. Zhu. How robust is google’s bard to adversarial image attacks?, 2023.

[9] T. Eloundou, S. Manning, P. Mishkin, and D. Rock. Gpts are gpts: An early look at the labor market impact potential of large language models, 2023.

[10] D. Ganguli, L. Lovitt, J. Kernion, A. Askell, Y. Bai, S. Kadavath, B. Mann, E. Perez, N. Schiefer, K. Ndousse, A. Jones, S. Bowman, A. Chen, T. Conerly, N. DasSarma, D. Drain, N. Elhage, S. El-Showk, S. Fort, Z. Hatfield-Dodds, T. Henighan, D. Hernandez, T. Hume, J. Jacobson, S. Johnston, S. Kravec, C. Olsson, S. Ringer, E. Tran-Johnson, D. Amodei, T. Brown, N. Joseph, S. McCandlish, C. Olah, J. Kaplan, and J. Clark. Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned, 2022.

[11] Henderson, P., Mitchell, E., Manning, C., Jurafsky, D., & Finn, C. (2023). Self-destructing models: Increasing the costs of harmful dual uses of foundation models. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’23.

[12] Huang, Q., Dong, X., Zhang, P., Wang, B., He, C., Wang, J., Lin, D., Zhang, W., & Yu, N. (2023). Opera: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation.

[13] Jain, N., Schwarzschild, A., Wen, Y., Somepalli, G., Kirchenbauer, J., yeh Chiang, P., ... & Goldstein, T. (2023). Baseline defenses for adversarial attacks against aligned language models.

[14] Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I., & Goldstein, T. (2023). A watermark for large language models. In Proceedings of the 40th International Conference on Machine Learning.

[15] Liu, H., Li, C., Wu, Q., & Lee, Y. J. (2023). Visual instruction tuning.

[16] Meng, K., Bau, D., Andonian, A., & Belinkov, Y. (2022). Locating and editing factual associations in GPT.

[17] Nasr, M., Carlini, N., Hayase, J., Jagielski, M., Cooper, A. F., Ippolito, D., ... & Lee, K. (2023). Scalable extraction of training data from (production) language models.

[18] OpenAI. (2023). Practices for governing agentic AI systems.

[19] OpenAI. (2023). GPT-4 with vision (GPT-4v) system card.

[20] Qin, Y., Liang, S., Ye, Y., Zhu, K., Yan, L., Lu, Y., ... & Sun, M. (2023). ToolLLM: Facilitating large language models to master 16000+ real-world APIs.

[21] Rafailov, R., Sharma, A., Mitchell, E., Ermon, S., Manning, C. D., & Finn, C. (2023). Direct preference optimization: Your language model is secretly a reward model.

[22] Robey, A., Wong, E., Hassani, H., & Pappas, G. J. (2023). SmoothLLM: Defending large language models against jailbreaking attacks.

[23] Sharma, M., Tong, M., Korbak, T., Duvenaud, D., Askell, A., Bowman, S. R., ... & Perez, E. (2023). Towards understanding sycophancy in language models.

[24] Bai, J., Bai, S., Yang, S., Wang, S., Tan, S., Wang, P., ... & Zhou, J. (2023). Qwen-vl: A frontier large vision-language model with versatile abilities. arXiv preprint arXiv:2308.12966.

[25] Liu, H., Li, C., Wu, Q., & Lee, Y. J. (2024). Visual instruction tuning. Advances in neural information processing systems, 36.

[26] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.

[27] Ma, X., Wang, Y., Jia, G., Chen, X., Liu, Z., Li, Y. F., ... & Qiao, Y. (2024). Latte: Latent diffusion transformer for video generation. arXiv preprint arXiv:2401.03048.

[28] P. Schramowski, M. Brack, B. Deiseroth, and K. Kersting, “Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22 522–22 531.

[29] R. Gandikota, J. Materzynska, J. Fiotto-Kaufman, and D. Bau, “Erasing Concepts from Diffusion Models,” arXiv preprint arXiv:2303.07345, 2023.

[30] Aldeen, M., MohajerAnsari, P., Ma, J., et al. WIP: A first look at employing large multimodal models against autonomous vehicle attacks.

[31] Cui, Y., Huang, S., Zhong, J., et al. (2023). DriveLLM: Charting the path toward full autonomous driving with large language models. IEEE Transactions on Intelligent Vehicles.

[32] Zeng, B., Zhou, C., Wang, X., et al. (2023). HuRef: HUman-REadable fingerprint for large language models. arXiv preprint arXiv:2312.04828.

[33] Wang, S. Y., Wang, O., Zhang, R., Owens, A., & Efros, A. A. (2020). CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 8695-8704).

[34] Frank, J., Eisenhofer, T., Schönherr, L., Fischer, A., Kolossa, D., & Holz, T. (2020, November). Leveraging frequency analysis for deep fake image recognition. In International conference on machine learning (pp. 3247-3258). PMLR.

[35] Ju, Y., Jia, S., Ke, L., Xue, H., Nagano, K., & Lyu, S. (2022). Fusing global and local features for generalized AI-synthesized image detection. arXiv preprint arXiv:2203.13964.

[36] Liu, Z., Qi, X., & Torr, P. H. (2020). Global texture enhancement for fake face detection in the wild. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 8060-8069).

[37] Tan, C., Zhao, Y., Wei, S., Gu, G., & Wei, Y. (2023). Learning on gradients: Generalized artifacts representation for gan-generated images detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12105-12114).

[38] Liu, B., Yang, F., Bi, X., Xiao, B., Li, W., & Gao, X. (2022, October). Detecting generated images by real images. In European Conference on Computer Vision (pp. 95-110). Cham: Springer Nature Switzerland.

[39] Wang, Z., Bao, J., Zhou, W., Wang, W., Hu, H., Chen, H., & Li, H. (2023). Dire for diffusion-generated image detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 22445-22455).

[40] Ojha, U., Li, Y., & Lee, Y. J. (2023). Towards universal fake image detectors that generalize across generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 24480-24489).

[41] Xu, K., Zhang, L., & Shi, J. (2024). Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond. arXiv preprint arXiv:2403.19653.

[42] Yang, Z., Liu, J., Han, Y., Chen, X., Huang, Z., Fu, B., & Yu, G. (2023). Appagent: Multimodal agents as smartphone users. arXiv preprint arXiv:2312.13771.

[43] Gu, X., Zheng, X., Pang, T., Du, C., Liu, Q., Wang, Y., ... & Lin, M. (2024). Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast. arXiv preprint arXiv:2402.08567.

[44] Tao, H., TV, S., Shlapentokh-Rothman, M., Hoiem, D., & Ji, H. (2023). Webwise: Web interface control and sequential exploration with large language models. arXiv preprint arXiv:2310.16042.

[45] Mao, J., Qian, Y., Zhao, H., & Wang, Y. (2023). Gpt-driver: Learning to drive with gpt. arXiv preprint arXiv:2310.01415.

[46] Qin, Y., Zhou, E., Liu, Q., Yin, Z., Sheng, L., Zhang, R., ... & Shao, J. (2023). Mp5: A multi-modal open-ended embodied system in minecraft via active perception. arXiv preprint arXiv:2312.07472.

[47] A. Zou, L. Phan, S. Chen, J. Campbell, P. Guo, R. Ren, A. Pan, X. Yin, M. Mazeika, A.-K. Dombrowski, S. Goel, N. Li, M. J. Byun, Z. Wang, A. Mallen, S. Basart, S. Koyejo, D. Song, M. Fredrikson, J. Z. Kolter, and D. Hendrycks. Representation engineering: A top-down approach to ai transparency, 2023.

[48] Reka Team. (2024). Reka Core, Flash, and Edge: A series of powerful multimodal language models.