zahalka.net: AI & security blog

What’s more powerful than one adversarial attack?
AI | Science | Security

By Jan Zahálka, 14 February 2024

Using a single attack won’t do, unless you are in a Hollywood film. This post covers AutoAttack, the pioneering ensemble adversarial attack, and shows how to test the adversarial robustness of AI models more rigorously.

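As a rough illustration of the kind of evaluation the post covers, here is a minimal sketch of running AutoAttack via the authors’ autoattack Python package; the toy model, the random test batch, and the 8/255 L∞ budget below are placeholder assumptions, not values taken from the post.

    # Minimal AutoAttack sketch (assumes: pip install autoattack, torch installed)
    import torch
    import torch.nn as nn
    from autoattack import AutoAttack

    # Placeholder classifier and test batch, standing in for the model under evaluation
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
    x_test = torch.rand(64, 3, 32, 32)        # images scaled to [0, 1]
    y_test = torch.randint(0, 10, (64,))      # ground-truth labels

    # The 'standard' version runs an ensemble of attacks (APGD-CE, APGD-T, FAB-T, Square)
    adversary = AutoAttack(model, norm='Linf', eps=8/255, version='standard', device='cpu')

    # Robust accuracy against the whole ensemble is reported during the run
    x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=64)

The robust accuracy against the full ensemble is the number to report, rather than accuracy against any single attack.
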
Can ChatGPT read who you are?
AI | Science

By Jan Zahálka, 24 January 2024

ChatGPT is excellent at extracting structured information from text. Can it evaluate our personality traits? This post describes our work on LLM personality assessment, accepted to the CAIHu workshop @ AAAI ’24.

Elves explain how to understand adversarial attacks
AI | Security

By Jan Zahálka, 10 January 2024

An intuitive understanding of adversarial attacks is central to understanding AI security. This post explains adversarial attacks with… Elves (instead of technical terminology).

A cyberattacker’s little helper: Jailbreaking LLM security
Science

By Jan Zahálka, 20 December 2023

Attacks, lies, and deceit to bypass the security of (an older version of) ChatGPT. Jailbreaking is an open LLM security challenge, as LLM services should not assist in malicious activity.

Judging LLM security: How to make sure large language models are helping us?
Science

By Jan Zahálka, 29 November 2023 (updated 1 December 2023)

Large language models (LLMs) have taken the world by storm, but LLM security is still in its infancy. Read about our contribution: a comprehensive, practical LLM security taxonomy.

AI security @ CVPR ’23: Honza’s highlights & conclusion
CVPR '23 | Science

By Jan Zahálka, 1 November 2023

This post presents “Honza’s highlights”: CVPR ’23 AI security papers that deserve your attention but did not receive the official highlight status, along with conclusions from CVPR ’23.

Reality can be lying: Deepfakes and image manipulation @ CVPR ’23
CVPR '23 | Science

By Jan Zahálka, 18 October 2023

Deepfakes & image manipulation are increasingly used to spread fake news or falsely incriminate people, presenting a security and privacy threat. This post summarizes CVPR ’23 work on the topic.

Privacy attacks @ CVPR ’23: How to steal models and data
CVPR '23 | Science

By Jan Zahálka, 4 October 2023

This post summarizes CVPR ’23 work on privacy attacks that threaten to steal an AI model (model stealing) or its training data (model inversion).

Backdoor attacks & defense @ CVPR ’23: How to build and burn Trojan horses
CVPR '23 | Science

By Jan Zahálka, 20 September 2023

Backdoor (or Trojan) attacks poison an AI model during training, essentially giving attackers the keys. This post summarizes CVPR ’23 research on backdoor attacks and defense.

From “maybe” to “absolutely sure”: Certifiable security at CVPR ’23
CVPR '23 | Science

By Jan Zahálka, 13 September 2023

Certifiable security (CS) gives security guarantees to AI models, which is highly desirable for practical AI applications. Learn about CS work at CVPR ’23 in this post.

© 2025 Jan Zahálka | Privacy policy
