Cyberweekly #225 - Managing your use of AI
What does it actually mean to use AI in your organisation or life?
You might want to use AI tools because they are cool and let you speed up some of your work, or you might seriously be considering how to best embed large language models into your analytical tool chain.
But understanding AI in the current climate is incredibly hard, not only because it is moving fast, but because the community is still struggling to define and agree on the language that we use. Is a large language model actually AI, or is it simply a stochastic parrot that can bark out words without understanding? What will it mean for AGI to be created, and what implications will it have for our daily work? How similar or different are the ethical implications of training machine learning models on art, on literature, or on your own organisation's documents?
Without well-defined language to say what AI is and can achieve, it will remain a form of occult knowledge, known only to a few people and otherwise spoken about in whispers, rumour and conjecture. After all, any sufficiently advanced technology is indistinguishable from magic! While many of our AI specialists try to explain their workings, each uses their own mental model of AI, and their own language to express how it works. This gives rise to a holy priesthood of AI whisperers who claim to know the inner workings of the AI mind, and how to bend it to their will.
This all means that it's incredibly hard as executives and leaders to define what appropriate and acceptable use of AI within an organisation might look like.
As technology vendors embed AI tooling into their existing technology offerings, it's going to become increasingly difficult to understand what risks your organisation is taking, where your data is going, and whether you are enabling the right kinds of behaviours in your staff. Will your users give up your organisation's data to an ever-increasing number of AI startups because they provide useful tools? And if you ban or prohibit such use, will you find yourself left behind by competitors who can decide and execute faster than you can?
That isn't to say I'm a doomsayer about it: I think AI has many magnificent capabilities, and its use in things like textual analysis, automated tooling and decision support is going to help us massively. But it's also going to create hidden dangers that we're not entirely aware of, and so we'll need to consider that carefully as we adopt AI, willingly or unwillingly, into our toolchains.
In collaboration with the private and public sectors, NIST has developed a framework to better manage risks to individuals, organizations, and society associated with artificial intelligence (AI). The NIST AI Risk Management Framework (AI RMF) is intended for voluntary use and to improve the ability to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.
Released on January 26, 2023, the Framework was developed through a consensus-driven, open, transparent, and collaborative process that included a Request for Information, several draft versions for public comments, multiple workshops, and other opportunities to provide input. It is intended to build on, align with, and support AI risk management efforts by others.
A companion NIST AI RMF Playbook also has been published by NIST along with an AI RMF Roadmap, AI RMF Crosswalk, and various Perspectives. In addition, NIST is making available a video explainer about the AI RMF.
This AI framework focuses on a central governance structure, essentially defining what we mean by AI and what we think the risks and controls are. It then expands into three core activities, Map, Measure and Manage, which organisations can use to map where AI implementations might create or exacerbate risks for them, measure the implications of those risks to determine whether they will manifest, and of course, manage the risks down to acceptable levels.
While definitely not perfect, it's one of the better frameworks for organisations considering using AI technologies in their core business to understand and develop their oversight of them.
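To make the Map, Measure and Manage cycle concrete, here is a minimal sketch of a risk register organised around those three functions. This is purely illustrative: the class, the 1-to-5 scoring scale and the escalation threshold are invented for this example and are not defined by the NIST AI RMF itself.

```python
# Illustrative sketch only: names, scales and thresholds are invented,
# not part of the NIST AI RMF.
from dataclasses import dataclass

@dataclass
class AIRisk:
    system: str           # Map: which AI use case gives rise to the risk
    description: str
    likelihood: int       # Measure: 1 (rare) to 5 (almost certain)
    impact: int           # Measure: 1 (negligible) to 5 (severe)
    mitigation: str       # Manage: the planned control

    @property
    def score(self) -> int:
        return self.likelihood * self.impact

def needs_escalation(risk: AIRisk, appetite: int = 12) -> bool:
    """Manage: flag risks whose score exceeds the organisation's risk appetite."""
    return risk.score >= appetite

register = [
    AIRisk("LLM document summariser", "Sensitive documents sent to a third-party API",
           likelihood=4, impact=4, mitigation="Restrict to contracted providers"),
    AIRisk("Chat assistant", "Hallucinated answers reach customers",
           likelihood=3, impact=2, mitigation="Human review before sending"),
]

for risk in register:
    action = "escalate" if needs_escalation(risk) else "accept"
    print(f"{risk.system}: score {risk.score} -> {action}")
```

The point is less the code than the shape of the exercise: each AI use case is mapped to a risk, measured against an agreed scale, and managed against an explicit appetite.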
The biggest problem I see with people when they think about AI is that they secretly—even to themselves—define AI as whatever a human does.
But when they try to give a definition, it doesn’t include humanity.
Then when someone gives an example of something intelligent that’s not human, that meets their criteria, they say, “That’s not intelligence.”
I like Daniel's definition of AGI, and note his warning here that we often misclassify or misdefine what we mean when we talk about AI.
This results in two people using the same words to talk about AI but meaning different things.
My preferred format for publishing these documents is as an annotated presentation —a single document (no clicking “next” dozens of times) combining key slides from the talk with custom written text to accompany each one, plus additional links and resources.
Here’s my most recent example: Catching up on the weird world of LLMs, from North Bay Python last week.
I don’t tend to write a detailed script for my talks in advance. If I did, I might use that as a starting point, but I usually prepare the outline of the talk and then give it off-the-cuff on the day. I find this fits my style (best described as “enthusiastic rambling”) better.
Instead, I’ll assemble notes for each slide from re-watching the video after it has been released.
I don’t just cover the things I said in the talk—I’ll also add additional context and links to related resources. The annotated presentation isn’t just for people who didn’t watch the talk; it’s aimed at providing extra context for people who did watch it as well.
I thought this was a really useful set of tips, as much like Simon (and probably inspired by the time I spent working with him), I don’t write scripts for my talks in advance, but follow the “enthusiastic rambling” presentation style.
I really like how Simon turns his talks into web pages that are accessible and easy to read, and his tweaking of the process here to include AI tools to save time and effort is a lovely indication of how a technologist can tackle problems when they have a good toolbox to hand.
This article is part of an ongoing experiment with the use of ChatGPT for developing primers on cyber security topics. Krypt3ia generated this text with ChatGPT, OpenAI’s large-scale language-generation model. This author reviewed, edited, and revised the language to my own liking and takes ultimate responsibility for the content of this publication.
I’ve enjoyed this series as a set of AI-generated analytical products, trying to demonstrate the capabilities of using AI as part of the analytical cycle.
I’m not sure whether it’s really that significant, given that Krypt3ia hasn’t really explored how much time the tweaking, rewriting and verification took, but as a thought-provoking test product to show what’s possible, I think it might indicate a direction in which organisations’ threat intelligence teams could go.
The biggest risk or concern I have with such products is that this sort of background research and development of primers is often given to junior members of the team. That’s not just because it’s “easy work” that could be outsourced to an AI, but because it’s a really good way to develop research, analytical and writing skills.
If we recognise that a senior analyst would have to read, verify and potentially rewrite this product before it is issued, regardless of whether it was written by a junior analyst or an AI, then at what point in their career does anyone develop the skills needed to become a senior analyst?
If we want to use this kind of thing, then we need a new set of activities for junior analysts that enables them to practise and develop these skills, so they can provide effective oversight of the analytical pipeline.
During the past months, discussions about generative AI have been abundant. You can stumble upon the topic in news, webinars, blogs and, of course, on darkweb forums. Building a malicious/evil/obnoxious GPT and marketing it on a dark forum seems to be the trend this summer.
It can look scary, but bear in mind that generative AI and machine learning are just tools. As with any other tool, people use them to do good or to harm. Cybersecurity specialists have been using AI modules in various ways to improve threat hunting and response capabilities. So, they are at least one step ahead of threat actors in this field.
I thought this was interesting, if slightly overwrought.
Much like the rest of the tech industry, the darknet and criminal ecosystem is filled with people overstating the capabilities of their systems and their tools.
Anyone claiming to be able to create “undetectable malware” shouldn’t be trusted even if they are using the latest in AI tools to help do it, and I’d be deeply suspicious of claims such as that.
What we can expect is that the social side of fraud operations will get more efficient. The criminal community willing to write lies in order to steal money or do damage isn’t actually that large, and so the skill of crafting empathetic, well-written phishing lures and fraudulent messages is not in mass supply.
The use of AI systems, which don’t really have a moral filter, means that the quality of those social aspects of fraud is going to get better, clearer and cheaper for attackers.
But it soon became clear that although the attackers had infected thousands of servers, they had dug deep into only a tiny subset of those networks—about 100. The main goal appeared to be espionage.
The hackers handled their targets carefully. Once the Sunburst backdoor infected a victim’s Orion server, it remained inactive for 12 to 14 days to evade detection. Only then did it begin sending information about an infected system to the attackers’ command server. If the hackers decided the infected victim wasn’t of interest, they could disable Sunburst and move on. But if they liked what they saw, they installed a second backdoor, which came to be known as Teardrop. From then on, they used Teardrop instead of Sunburst. The breach of SolarWinds’ software was precious to the hackers—the technique they had employed to embed their backdoor in the code was unique, and they might have wanted to use it again in the future. But the more they used Sunburst, the more they risked exposing how they had compromised SolarWinds.
Through Teardrop, the hackers stole account credentials to get access to more sensitive systems and email. Many of the 100 victims that got Teardrop were technology companies—places such as Mimecast, a cloud-based service for securing email systems, or the antivirus firm Malwarebytes. Others were government agencies, defense contractors, and think tanks working on national security issues. The intruders even accessed Microsoft’s source code, though the company says they didn’t alter it.
Thoroughly researched and a really interesting read into the SolarWinds hack from a few years back.
What comes through to me is just how lucky the industry was at the point at which the attack was spotted. This could have been much worse had it not been unpicked when it was.
These documents appeared to be dated to early March, around the time they were first posted online on Discord, a messaging platform popular with gamers.
However, Bellingcat has seen evidence that some documents dated to January could have been posted online even earlier, although it is unclear exactly when. Bellingcat also spoke to three members of the Discord community where the images had been posted who claimed that many more documents had been shared across other Discord servers in recent months.
Bizarrely, the Discord channels in which the documents dated from March were posted focused on the Minecraft computer game and fandom for a Filipino YouTube celebrity. They then spread to other sites such as the imageboard 4Chan before appearing on Telegram, Twitter and then major media publishers around the world in recent days.
What was far more interesting about this story from last April was the intent and methodology of the leaker, rather than anything else.
Ego has always been something that drives espionage collection; attempting to get someone to provide secrets by playing to their ego is a tactic that has been taught to spies around the world for hundreds of years.
But in this case, the ability to build a community of people from around the world who want to listen to you, who look up to you, and to whom you can whisper secrets you are aware of is a change in the fabric of society that is still impacting us today.
The work of people tracking insider threat has always focused on social cohesion. Who do you spend time with outside of work? What are your interests? But traditional personnel security and vetting systems are really focused on the people you come into physical contact with.
The internet has massively widened the net of people that you can interact with, whether through hundreds or thousands of Facebook friends or Twitter followers, or just online dating profiles. This change, the Social or Web 2.0 revolution on the internet, has mostly changed how people interact weakly with their social circle, increasing the number of people we can keep aware of our life, but hasn’t significantly changed the way we bond and form deep friendships. But over the last few years, the rise of generations of people who interact primarily online, along with better tools for engaging in real time, from WhatsApp groups to Discord channels, means that it’s far easier to build strong social ties to people you don’t really know.
All of which is to say that this sort of event was probably inevitable, but our internal concepts of how insiders think, work and carry out their insidious tasks need to keep up with social changes.
In his post, Thomas quite rightly points out that there is no actual supplier agreement inherent in open source software dependencies. In that sense, a developer of OSS is not a supplier to upstream projects. The use of the “supply chain” metaphor can cause misunderstandings — as when open source software project maintainers receive legal notices or letters from companies. Indeed, this very thing happened during Log4j. Open source developer Daniel Stenberg (famous for his foundational work on the Curl project) received letters from a (well meaning) corporate procurement team regarding the Curl project, which treated Daniel as a supplier, compelling him to provide data about use of Log4j in Curl within a strict timeline. While this kind of communication might be appropriate in a strict supplier relationship, which is codified in a contract with a service level agreement, it’s not appropriate — at all — in an open source context.
I think the nomenclature and mental model around software supply chain is still a very useful construct, especially when communicating with policy makers or other decision makers who are unfamiliar with software dependencies or how open source works. However, this metaphor only goes so far. When we use it, we should also be cognizant of the fact that open source dependencies are not a supplier agreement, and more importantly, there is an entire community of open source software developers out there who do not view themselves in this way. The software security complex needs to be a big tent — inclusive of those people and communities who may not be as comfortable with the “supply chain” way of thinking — if we want to achieve the goal of leveling up software security across the industry.
Great points from Dan Applequist here. This isn’t to say that the models we have for managing our supply chain won’t work; indeed, things like SBOMs are excellent within an open source context. But it’s important to be cognisant that the supply chain relationship with an actual software supplier is different from the one with open source software.
I think the difference for me is where the active party sits in open source software. In most relationships, you can have either an entirely active partner and a passive partner, or a more equal relationship. If you are buying shrinkwrap software off the shelf, then you are almost entirely the active party, and the supplier is simply providing some software without a relationship. But many supplier relationships have more active “suppliers”: people with sales teams, consulting offers and partner success employees who will engage with you. For most open source, though, we’re back to the shrinkwrap model. As a writer of open source, I have no idea of, or involvement in, your decision to use my software, and as such I have very few responsibilities to you. All of the power lies in your hands as the active partner.
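For reference, an SBOM entry is simply structured metadata about a dependency. Here is a minimal illustrative fragment in the CycloneDX format; the specific component and version are chosen for illustration only:

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {
      "type": "library",
      "group": "org.apache.logging.log4j",
      "name": "log4j-core",
      "version": "2.17.1",
      "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1"
    }
  ]
}
```

Notice that nothing in the record implies a contract or a service level: it identifies what you depend on, which works equally well whether the component came from a paid supplier or a volunteer maintainer.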
It took only a few seconds to uncover the target’s entire life.
On the messaging app Telegram, I entered a tiny amount of information about my target into the dark blue text box—their name and the state I believed they lived in—and pressed enter. A short while later, the bot spat out a file containing every address that person had ever lived at in the U.S., all the way back to their college dorm more than a decade earlier. The file included the names and birth years of their relatives. It listed the target’s mobile phone numbers and provider, as well as personal email addresses. Finally, the file contained information from their driver’s license, including its unique identification number. All of that data cost $15 in Bitcoin. The bot sometimes offers the Social Security number too for $20.
This is the result of a secret weapon criminals are selling access to online that appears to tap into an especially powerful set of data: the target’s credit header. This is personal information that the credit bureaus Experian, Equifax, and TransUnion have on most adults in America via their credit cards. Through a complex web of agreements and purchases, that data trickles down from the credit bureaus to other companies who offer it to debt collectors, insurance companies, and law enforcement.
The US is increasingly legislating around data protection, including the recent California Consumer Privacy Act. I think that the slightly shadowy world of data brokers is increasingly becoming apparent to more and more people.
One of the problems in this space is getting people to care. Survey after survey finds that most normal people essentially shrug. They know their data is out there, hoovered up by advertisers and bundled and resold, but because it generally has so little impact on their personal life, it’s difficult for anyone to work out what they would do about it.
This is a space where industry codes of practice and government regulation are likely the only way to make a meaningful difference, because the harm happens in bulk against citizens rather than any specific individual.
In the last week of August, Truesec Cybersecurity Incident Response Team (CSIRT) investigated a Microsoft Teams malware campaign delivering malware identified as DarkGate Loader.
On August 29, in the timespan from 11:25 to 12:25 UTC, Microsoft Teams chat messages were sent from two external Office 365 accounts compromised prior to the campaign. The message content aimed to social engineer the recipients into downloading and opening a malicious file hosted remotely.
This shouldn’t be much of a surprise, but as border scanning of mail gateways and mail scanning in Outlook and Gmail gets so much better, attackers will seek to use systems that don’t have as much monitoring and protection on them.
Allowing external messages in Teams is always going to be tricky. It’s incredibly useful to be able to message people in other companies, especially where you might have a commercial partnership, a strategic partnership or even just a buyer/supplier relationship.
However, because IM systems lack the same scanning, attackers will attempt to abuse that system.
Secondly, our trust mechanisms within Teams specifically, and many other IM tools, rely on trusting specific internet domains. But even in a medium-sized organisation, due to org changes, digital teams with their own budgets, and subcontracting arrangements, you might have tens of domains that people who work for you might use at some point. This makes it hard to know which domains to trust and incredibly hard to maintain that list over time.
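As a tiny illustration of why such lists are brittle, here is a naive domain allowlist check of the kind these trust mechanisms effectively boil down to. The domain names are invented, and real Teams external-access policy is configured in the tenant admin centre rather than in application code; the point is simply that every subsidiary or project domain has to be enumerated explicitly.

```python
# Naive allowlist check for external IM senders. Domain names below are
# invented examples; real Teams federation settings live in the tenant
# admin centre, not in code like this.

ALLOWED_EXTERNAL_DOMAINS = {
    "partner-corp.com",
    "digital.partner-corp.com",   # subsidiary teams often run their own domains
    "consulting-firm.example",
}

def is_trusted_sender(address: str) -> bool:
    """Trust a sender only if their exact domain is on the allowlist."""
    domain = address.rsplit("@", 1)[-1].lower()
    return domain in ALLOWED_EXTERNAL_DOMAINS

print(is_trusted_sender("alice@Partner-Corp.com"))     # listed domain, case-insensitive
print(is_trusted_sender("bob@labs.partner-corp.com"))  # unlisted subdomain is untrusted
```

A newly created subsidiary domain silently fails this check until someone remembers to update the list, which is exactly the maintenance burden described above.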
Thanks for reading