Pentestify wins the Merge Madrid Startup Battle 2024

Andrew Law pitches for Pentestify at Merge Madrid

01/09/2023

“Merge Madrid also spotlighted emerging talent with the Startup Competition sponsored by Tritemius, which brought together 19 finalists from verticals such as real estate tokenization, carbon credit tokenization, community management, and WEB3 cybersecurity, among others. After a fiercely contested battle, Pentestify was crowned the winner for its innovative digital security and audit solution. This competition underscored Merge’s commitment to supporting startups driving change and disruption in the WEB3 ecosystem.”

Read the full article on Cointelegraph

Pentestify shortlisted for UK Innovative StartUp Award 2024

01/04/2024

Pentestify has been named a finalist in the Innovative StartUp of the Year category ahead of the UK StartUp Awards!

The UK StartUp Awards was launched to recognise the booming start-up scene across the UK which has accelerated over the last few years with over 900,000 new businesses founded in the UK in 2023, a 12% increase from the year before. Over 900 businesses have been shortlisted for this year’s UK StartUp Awards and the contribution of these firms – all of which were started in the last three years – is significant, having created nearly 6,000 new jobs since they were established and generating annual sales of £480 million.

Pentestify is an AI smart contract security company founded by Andrew Law and Lucas Martin Calderon in 2022. It has been nominated for the Innovative StartUp of the Year Award in the South West of the UK.

Supported nationally by ScoreApp, GS1 UK, Starling Bank, OVHcloud, GiftRound and Airwallex, the programme will celebrate the achievements of the amazing individuals who have turned an idea into an opportunity and taken the risk to launch a new product or service.

The cohort of finalists will be considered for the regional prizes by a panel of seasoned judges with experience founding or supporting entrepreneurial ventures. The winners from each region will then be invited to the first UK final taking place at Ideas Fest, the Glastonbury for Business festival in Tring, Hertfordshire on 12th September 2024.

Professor Dylan Jones-Evans OBE, the creator of the UK StartUp Awards, said: “Start-up businesses are the lifeblood of any economy, being responsible for new jobs, innovation and wealth creation across the UK. All the finalists this year represent the best of those entrepreneurs who have spotted an opportunity and through their sheer hard work, talent and perseverance, have created amazing new businesses that are creating real impact in their sectors. Building on the success of previous years, they are now looking to identify the ‘best of the best’ with all the finalists who win their category in their region going on to represent their region at the first ever UK National StartUp Awards final later this year.”

The UK StartUp Awards was created in collaboration with the team behind the Great British Entrepreneur Awards, one of the most successful awards programmes in the UK. The UK StartUp Awards are running for the third year after launching in 2022.

This year’s regional UK StartUp Awards finalists can be found online at https://startupawards.uk/

StartUp Awards

The UK StartUp Awards is a collaboration between the founders of the Great British Entrepreneur Awards, an established national programme that receives thousands of applications annually.

The UK StartUp Awards will recognise the achievements of those amazing individuals who have had a great idea, spotted the opportunity and taken the risks to launch a new product or service. After extending across the whole of the United Kingdom last year, the UK StartUp Awards will now bring together the winners from each of the 10 UK regions at a national final later in the year.

Breaking down NIST’s Internal Report IR 8472 Non-Fungible Token Security

18/03/2024

NIST has published the final version of Internal Report (IR) 8472, Non-Fungible Token Security. We’ve broken down the key takeaways from NIST’s Internal Report 👇

Background

Blockchain

Blockchains are tamper-evident and tamper-resistant digital ledgers, implemented without a central repository and often operating without a central authority. They consist of cryptographically signed transactions grouped into blocks, each of which is cryptographically linked to the previous one, ensuring data integrity and resistance to modification.

Smart Contracts

Smart contracts are collections of code and data deployed on the blockchain network, facilitating automated, secure transactions and state management without intermediaries. They are executed by nodes within the blockchain network, ensuring consistent results across the network.

Tokens

Tokens are digital representations of assets, managed by smart contracts on a blockchain. They can be fungible, with identical tokens being interchangeable, or non-fungible (NFTs), where each token is unique and represents a distinct asset or property.

NFT Definition

A Non-Fungible Token (NFT) is an owned, transferable, indivisible data record on a blockchain, representing a digital or physical asset. Unlike fungible tokens, NFTs are unique, with each token linked to a specific asset, managed by a smart contract.

The 11 Properties of NFTs

NFT properties derive from their definition and are provided by smart contracts, the underlying blockchain, and human management.

  1. Owned: Ownership is designated by recording a blockchain address within the NFT.
  2. Transferable: NFTs can be transferred between owners or approved entities.
  3. Indivisible: NFTs cannot be subdivided, maintaining their uniqueness.
  4. Linked: Each NFT is linked to a specific asset it represents.
  5. Recorded: NFT transactions and data are recorded on the blockchain.
  6. Provenance: The history of NFT ownership is traceable via the blockchain.
  7. Permanence: NFTs are designed to be indestructible on the blockchain.
  8. Immutable: The asset an NFT represents cannot be modified.
  9. Unique: Each NFT is unique, representing a specific asset.
  10. Authentic: The authenticity of the asset is claimed by the NFT.
  11. Authorized: The asset’s sale as an NFT has been authorized by its owner.

The 27 Potential Security Concerns

1. Ownership Confusion: Buyers might believe they’re purchasing an asset, not an NFT.
2. Unauthorized NFT Creation: Smart contracts may link NFTs to assets without legal authority.
3. Account Compromise: Theft of blockchain account keys can result in NFT theft.
4. Immediate Sale of Stolen NFTs: Thieves quickly sell stolen NFTs for cryptocurrency.
5. Lack of Restoration Mechanisms: Stolen tokens often cannot be restored.
6. Potential Confiscation by Contract Managers: Managers could misuse their power to transfer tokens.
7. Future Manager Privileges: Updates to smart contracts could grant new powers to managers, including transferring tokens.
8. Smart Contract Vulnerabilities: Coding errors could allow token theft.
9. Fractional Ownership Risks: Additional smart contracts for fractional ownership increase attack surfaces.
10. Forced Buyout Unawareness: Fractional owners may not realize they can lose shares through forced buyouts.
11. Delinking Risk: Incorrect metadata can render an NFT worthless.
12. Server Failure: External data hosting failures can delink NFTs.
13. Compromise of Off-Blockchain Link Tables: Attackers can alter NFT linkages.
14. Owner-Initiated Delinking: Table owners can intentionally delink NFTs.
15. Public Information Unawareness: NFT ownership data is public.
16. De-Anonymization Risks: Blockchain accounts can be traced back to individuals.
17. Blockchain History Alteration: Attacks could modify blockchain history.
18. NFT Burning: Sending NFTs to inaccessible addresses effectively destroys them.
19. Self-Destructing Smart Contracts: Contracts could be programmed to destroy themselves.
20. Data Record Alteration: Vulnerabilities might allow changes to NFT data.
21. Immutability Exceptions: Consensus or forks can alter blockchains.
22. Chain Splits: Forks can duplicate NFTs across blockchains.
23. Non-Unique Asset Linking: Multiple NFTs can link to the same asset.
24. Simultaneous Sales on Multiple Exchanges: The same asset can be sold as different NFTs.
25. Forged or Misattributed Assets: NFTs might misrepresent asset authenticity.
26. Unauthorized Sales: Sellers might not have the right to sell the NFT.
27. Misunderstood Purchase Rights: Buyers might not receive the expected rights over the asset.

Marketplaces and Exchanges

NFT marketplaces facilitate the trading, creation, and sale of NFTs, offering various buying mechanisms and requiring attention to wallet security. The choice between decentralized and centralized custody models influences risk and user responsibility.

Conclusion

NFTs provide a secure method for representing and transferring ownership of unique digital and physical assets. However, their implementation and ecosystem are subject to a range of security vulnerabilities. Addressing these concerns through systematic security approaches is crucial for maintaining the integrity and trustworthiness of NFT technology.

DeFi: Full Guide [Part I]

13/02/2024

In this long article, you will learn the core concepts that make DeFi tick, how it differs from traditional finance, and the main use cases and commercial applications DeFi has found in our world today.

Topics covered

1. Lending and borrowing

  • Deep dive into Aave

2. Liquidations

  • Overcollateralization and bad debt
  • Thresholds
  • Account liquidity
  • Health factor
  • Insolvent position analysis
  • Generalization

3. Rewards

1. Lending and Borrowing

Unlike traditional finance, where liquidity is held centrally, DeFi crowd-funds its liquidity pools from regular and professional users. As long as a loan remains overcollateralized, this gives the borrower much more leeway over when to pay back the borrowed amount.

The predominantly collateralized loan structure offered by most DeFi lending protocols uncovers an intriguing strategy for trading: leveraging.

Assume, for instance, that you have a strong positive outlook on the price of Ethereum (ETH) and are absolutely confident that it’s on an upward trajectory. You might deposit a certain amount of ETH (let’s say, worth $1000) into a DeFi lending protocol of your choice. Then, you could leverage this deposit to borrow a stablecoin, like DAI, and subsequently use that to acquire more ETH. For this illustration, consider that you buy an additional $500 worth of ETH on an exchange, increasing your ETH exposure to $1500 from your initial $1000 investment.

However, the strategy can be expanded further. What if you used the additional $500 worth of ETH you just bought as collateral to borrow more DAI? This method, referred to as over-leveraging, can be repeated until the protocol’s policies restrict you, typically when your borrowing limit is reached.
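The repeated deposit-and-borrow loop can be worked through numerically. The sketch below is only an illustration: it assumes a hypothetical protocol that lets you borrow a fixed fraction (`borrow_fraction`, 50% here) of each new deposit, and it ignores fees, interest and price changes. The function name and parameters are made up for this example.

```python
def leveraged_exposure(initial_deposit: float, borrow_fraction: float, rounds: int) -> float:
    """Total ETH exposure (in USD) after re-depositing borrowed funds `rounds` times."""
    exposure = initial_deposit
    deposit = initial_deposit
    for _ in range(rounds):
        borrowed = deposit * borrow_fraction  # DAI borrowed against the latest deposit
        exposure += borrowed                  # swapped for ETH, which adds to the exposure
        deposit = borrowed                    # the newly bought ETH is the next collateral
    return exposure


one_round = leveraged_exposure(1000, 0.5, 1)   # the $1000 -> $1500 case from the text
two_rounds = leveraged_exposure(1000, 0.5, 2)  # one more loop on the extra $500
ceiling = 1000 / (1 - 0.5)                     # geometric-series limit of the loop
```

Under these assumptions each round adds half as much as the previous one, so the exposure approaches but never passes `ceiling` (twice the initial deposit at a 50% borrow fraction).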

On the flip side, suppose your perspective on ETH is less than optimistic. In such a scenario, you could deposit DAI as collateral to borrow ETH, which you would immediately swap for more DAI. If your forecast holds and ETH prices fall, you could repurchase the borrowed amount for less on an exchange, repay your loan, and pocket the remaining DAI. This effectively opens (and subsequently closes) a short position on ETH, thereby capitalizing on its declining market value.
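To make the short position concrete, here is a small worked calculation. The prices are hypothetical, and the sketch deliberately ignores borrow interest, swap fees and slippage, all of which would reduce the result in practice.

```python
def short_profit_dai(eth_borrowed: float, entry_price: float, exit_price: float) -> float:
    """DAI left over after selling borrowed ETH at entry_price and buying it back at exit_price."""
    dai_received = eth_borrowed * entry_price  # borrowed ETH is swapped for DAI immediately
    dai_to_repay = eth_borrowed * exit_price   # cost of repurchasing the ETH to repay the loan
    return dai_received - dai_to_repay


profit = short_profit_dai(eth_borrowed=1.0, entry_price=2000.0, exit_price=1500.0)
loss = short_profit_dai(eth_borrowed=1.0, entry_price=2000.0, exit_price=2400.0)
```

With these made-up numbers, a fall from $2000 to $1500 leaves 500 DAI of profit, while a rise to $2400 costs 400 DAI: the position loses money whenever ETH goes up instead of down.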

Dangers?

  • The Lido protocol could suffer a breach or its validators could get slashed, causing you to lose a part of your staked ETH permanently or even all of it.
  • If the price of stETH falls while you are borrowing ETH against it, your position might drop below the minimum collateral threshold. Your stETH would then be liquidated, leaving you with only the borrowed ETH, so you could lose 30% or more to simple market movements outside your control.

2. Share Tokens

In the world of Decentralized Finance (DeFi), akin to traditional finance (TradFi), depositors are encouraged to keep their assets in lending pools for longer durations through the incentive of accruing interest over time. The interest gained is a calculated percentage of a user’s deposit, as determined by the protocol, and can be claimed by the user who deposited. As the duration of asset retention in the lending pool increases, so does the interest accumulated.

The question that arises is how does the protocol maintain a record of each user’s proportionate share in the pool? It’s crucial to understand that as a user deposits assets into the pool, the shares of all existing users are diluted, and this change is factored in by the protocol. Yet, the protocol does not continually monitor and adjust each user’s share directly with every deposit or withdrawal transaction. Implementing such a system on-chain would not only be enormously inefficient but also excessively costly for depositors, who would bear the expenses for each update operation.

Instead, the protocol is designed to only manage the change in the share of the depositor, without needing to actively adjust the shares of other users.

It might appear at first glance that this protocol system allows users to enjoy all the benefits without any trade-offs, but it’s not entirely so:

Protocols manage the process of interest distribution by creating (minting) and destroying (burning) ERC20 tokens, which we’ll call “Stake Tokens” for this discussion. These Stake Tokens represent the percentage of assets a lender has deposited in the lending pool. This Stake Token mechanism automatically compensates for the dilution of other stakeholders’ shares when Stake Tokens are minted or burned, correlating with the deposit or withdrawal of the underlying assets.

Let’s review a simplified example of how this system might work, using the concept of “Stake Tokens” (or “Share Tokens”). We’ll use hypothetical figures for the sake of easy understanding.

Let’s imagine there’s a lending pool called “Pool A”, and currently, there are only two users who have deposited assets into it, Alice and Bob. Alice has deposited 500 DAI, and Bob has deposited 500 DAI as well. So, the total value of the pool is 1000 DAI.

In return, the protocol has minted 1000 Stake Tokens, with Alice and Bob each holding 500 Stake Tokens, representing their respective shares in the pool.

Now, suppose Charlie comes along and deposits 500 DAI into Pool A. The total pool is now 1500 DAI, and the protocol mints 500 new Stake Tokens for Charlie, bringing the total supply to 1500. Charlie’s deposit dilutes the percentage shares of Alice and Bob from 50% each to one third each, but the number of Stake Tokens they hold remains the same, so the protocol never has to touch their balances.

When Charlie later wants to withdraw his funds along with any interest earned, the protocol burns his Stake Tokens and pays out his proportionate share of the pool.

The Stake Tokens are an efficient way for the protocol to keep track of each depositor’s share in the pool without having to continually adjust everyone’s shares with each deposit or withdrawal. Each Stake Token can be thought of as a claim on the underlying assets in the pool, and their value adjusts automatically as more assets are deposited or withdrawn.

(Please note that this is a simplified example and actual DeFi protocols may operate with more complexity and additional safety measures to protect the lenders and borrowers).
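The bookkeeping behind Pool A can be sketched in a few lines. This is a toy model, not any protocol's implementation: it assumes shares are minted at deposit time at the current exchange rate and burned on withdrawal, and the class and method names are invented for the illustration.

```python
class SharePool:
    """Toy lending pool that tracks ownership with share ("Stake") tokens."""

    def __init__(self) -> None:
        self.total_assets = 0.0  # DAI held by the pool
        self.total_shares = 0.0  # Stake Tokens in circulation
        self.shares = {}         # user -> Stake Token balance

    def deposit(self, user: str, amount: float) -> float:
        # The first depositor gets shares 1:1; later ones at the current exchange rate.
        if self.total_shares == 0:
            minted = amount
        else:
            minted = amount * self.total_shares / self.total_assets
        self.total_assets += amount
        self.total_shares += minted
        self.shares[user] = self.shares.get(user, 0.0) + minted
        return minted

    def accrue_interest(self, amount: float) -> None:
        # Interest raises assets without minting shares, so every share is worth more.
        self.total_assets += amount

    def withdraw_all(self, user: str) -> float:
        burned = self.shares.pop(user)
        payout = burned * self.total_assets / self.total_shares
        self.total_assets -= payout
        self.total_shares -= burned
        return payout


pool = SharePool()
for name in ("alice", "bob", "charlie"):
    pool.deposit(name, 500.0)
pool.accrue_interest(150.0)              # borrowers pay 150 DAI of interest into the pool
alice_payout = pool.withdraw_all("alice")
```

Only the depositor's own balance is written on each deposit or withdrawal; everyone else's percentage changes implicitly through `total_shares` and `total_assets`, which is the efficiency gain described above.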

🤿 Now, let’s dive deep into Aave’s Share Token:

First, Aave is a decentralized non-custodial liquidity markets protocol where users can participate as suppliers or borrowers. Suppliers provide liquidity to the market to earn a passive income, while borrowers are able to borrow in an overcollateralized (perpetually) or undercollateralized (one-block liquidity) fashion.

Alright, let’s break this down by first understanding the individual concepts and then tying them together:

1. Decentralized: This term refers to the lack of a central authority or entity controlling a system. Instead, control is distributed among various participants in the system. In the case of blockchain technology, this typically involves a network of computers (known as nodes) that maintain a shared ledger of transactions (the blockchain).

2. Non-custodial: This is a type of financial service where the service provider does not hold or control user assets. Instead, the assets are under the control of the user, typically secured by cryptographic methods. The user usually controls the private key that allows access to the assets.

3. Liquidity markets protocol: This is a fancy term for a set of rules that govern a market where assets can be quickly bought or sold without causing significant changes in their prices. In the context of DeFi, a liquidity market protocol would be the rules and mechanisms that allow users to lend or borrow assets.

4. Suppliers and Borrowers: These are the participants in the market. Suppliers (also known as lenders) provide assets to the market, while borrowers take assets from the market. The terms are borrowed from traditional finance but are used in the same way in the DeFi space.

5. Liquidity: In the context of DeFi, liquidity typically refers to the availability of assets in a market. If a market has a lot of assets (like a cryptocurrency) available to buy or sell, it has high liquidity.

6. Passive income: This is income earned without active involvement. In this context, suppliers earn passive income by providing their assets to the market, which are then borrowed by other users. They earn interest on these assets as a form of passive income.

7. Overcollateralized and Undercollateralized: Collateral is an asset that a borrower offers as a guarantee to a lender. If the borrower fails to pay back the loan, the lender can seize the collateral. Overcollateralized means the value of the collateral is greater than the value of the loan. Undercollateralized means the value of the collateral is less than the value of the loan.

8. One-block liquidity: This term is unique to blockchain-based systems. A “block” refers to a group of transactions that are processed together on a blockchain. In this context, “one-block liquidity” refers to Aave’s flash loans: loans that need no collateral because they must be borrowed and repaid within a single transaction, and therefore within a single block.
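The collateral definitions in point 7 reduce to a simple ratio. The helper below is a minimal sketch with hypothetical names; real protocols apply per-asset thresholds and price oracles rather than a bare comparison against 1.

```python
def collateral_ratio(collateral_value: float, loan_value: float) -> float:
    """Value of the collateral divided by the value of the loan."""
    return collateral_value / loan_value


def classify_loan(collateral_value: float, loan_value: float) -> str:
    ratio = collateral_ratio(collateral_value, loan_value)
    if ratio > 1:
        return "overcollateralized"
    if ratio < 1:
        return "undercollateralized"
    return "fully collateralized"


typical_defi_loan = classify_loan(collateral_value=1500.0, loan_value=1000.0)
flash_loan_like = classify_loan(collateral_value=0.0, loan_value=1000.0)
```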

Now, let’s have a look at the smart contracts (directly from GitHub) responsible for handling what happens when users deposit liquidity into, and borrow from, the protocol. Please note that I have extracted only the relevant code snippets, to make them faster for you to read:

The source code can be found here: https://github.com/aave/aave-v3-core/tree/master

The borrowing process in Aave is divided into these main parts:

  1. Depositing: When a user deposits tokens into a lending pool, aTokens are minted and transferred to the depositor. The amount of aTokens received is proportional to the amount of the underlying asset that was deposited.
  2. Borrowing: When a user borrows tokens, they must specify the interest rate mode they want to use (either stable or variable), the amount they want to borrow, and the asset they want to borrow.

Here is the code snippet for the borrowing function:

function executeBorrow(
  mapping(address => DataTypes.ReserveData) storage reservesData,
  mapping(uint256 => address) storage reservesList,
  mapping(uint8 => DataTypes.EModeCategory) storage eModeCategories,
  DataTypes.UserConfigurationMap storage userConfig,
  DataTypes.ExecuteBorrowParams memory params
) public {
  // ... (validation and reserve state update elided)
  if (params.interestRateMode == DataTypes.InterestRateMode.STABLE) {
    // Stable mode: mint stable debt tokens at the current stable rate
    (
      isFirstBorrowing,
      reserveCache.nextTotalStableDebt,
      reserveCache.nextAvgStableBorrowRate
    ) = IStableDebtToken(reserveCache.stableDebtTokenAddress).mint(
      params.user,
      params.onBehalfOf,
      params.amount,
      currentStableRate
    );
  } else {
    // Variable mode: mint variable debt tokens scaled by the variable borrow index
    (isFirstBorrowing, reserveCache.nextScaledVariableDebt) = IVariableDebtToken(
      reserveCache.variableDebtTokenAddress
    ).mint(params.user, params.onBehalfOf, params.amount, reserveCache.nextVariableBorrowIndex);
  }
  // ...
}

The borrowing action is executed in the executeBorrow() function. It checks if the user is eligible for a borrow operation with ValidationLogic.validateBorrow(), updates the state of the reserve with reserve.updateState(reserveCache), and mints the corresponding debt tokens.
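The three steps named in this paragraph can be modelled with a toy ledger. Everything in the sketch is a simplification made up for illustration: `ToyReserve`, `toy_execute_borrow` and the single `borrow_limit` check are stand-ins, the steps run in the order the paragraph lists them, and the real contract may sequence and implement them differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReserve:
    """Hypothetical stand-in for a reserve; not Aave's real storage layout."""
    stable_debt: dict = field(default_factory=dict)
    variable_debt: dict = field(default_factory=dict)
    state_updates: int = 0

    def update_state(self) -> None:
        # Placeholder for reserve.updateState(); here it only counts calls.
        self.state_updates += 1


def toy_execute_borrow(reserve: ToyReserve, user: str, amount: float,
                       mode: str, borrow_limit: float) -> None:
    # Step 1: validation (placeholder for ValidationLogic.validateBorrow()).
    if mode not in ("stable", "variable"):
        raise ValueError("unknown interest rate mode")
    if amount <= 0 or amount > borrow_limit:
        raise ValueError("borrow amount not allowed")
    # Step 2: refresh the reserve state before touching balances.
    reserve.update_state()
    # Step 3: mint debt tokens of the type matching the chosen mode.
    ledger = reserve.stable_debt if mode == "stable" else reserve.variable_debt
    ledger[user] = ledger.get(user, 0.0) + amount


reserve = ToyReserve()
toy_execute_borrow(reserve, "alice", 100.0, "variable", borrow_limit=500.0)
```

A failed validation raises before any state changes, so a rejected borrow leaves both debt ledgers untouched.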

  3. Minting and Burning Debt Tokens: When a user borrows tokens, executeBorrow() mints the corresponding debt tokens. There are two types of debt tokens, stable and variable, and the type minted depends on the interest rate mode the borrower chose. When the loan is repaid, executeRepay() burns the matching debt tokens:

function executeRepay(
  mapping(address => DataTypes.ReserveData) storage reservesData,
  mapping(uint256 => address) storage reservesList,
  DataTypes.UserConfigurationMap storage userConfig,
  DataTypes.ExecuteRepayParams memory params
) external returns (uint256) {
  // ...
  if (params.interestRateMode == DataTypes.InterestRateMode.STABLE) {
    // Stable mode: burn stable debt tokens for the amount being repaid
    (reserveCache.nextTotalStableDebt, reserveCache.nextAvgStableBorrowRate) = IStableDebtToken(
      reserveCache.stableDebtTokenAddress
    ).burn(params.onBehalfOf, paybackAmount);
  } else {
    // Variable mode: burn variable debt tokens, scaled by the variable borrow index
    reserveCache.nextScaledVariableDebt = IVariableDebtToken(
      reserveCache.variableDebtTokenAddress
    ).burn(params.onBehalfOf, paybackAmount, reserveCache.nextVariableBorrowIndex);
  }
  // ...
}

  4. Interest Rate Swap: Borrowers can switch between the stable and variable interest rates at any time.


The interest rate mode can be swapped in the executeSwapBorrowRateMode() function. It validates the swap with ValidationLogic.validateSwapRateMode(), updates the state of the reserve, burns the current debt tokens and mints new ones in the chosen interest rate mode.
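The burn-then-mint pattern of a rate swap can be sketched with two plain dictionaries acting as debt-token ledgers. This is an illustrative model only: `toy_swap_rate_mode` and its arguments are hypothetical, the validation is reduced to a single check, and interest accrual is left out entirely.

```python
def toy_swap_rate_mode(stable_debt: dict, variable_debt: dict,
                       user: str, current_mode: str) -> str:
    """Move a user's whole debt from one ledger to the other; returns the new mode."""
    if current_mode == "stable":
        source, target, new_mode = stable_debt, variable_debt, "variable"
    elif current_mode == "variable":
        source, target, new_mode = variable_debt, stable_debt, "stable"
    else:
        raise ValueError("unknown interest rate mode")

    amount = source.get(user, 0.0)
    if amount <= 0:
        # Placeholder for the checks done by ValidationLogic.validateSwapRateMode().
        raise ValueError("no debt to swap in the current mode")

    source[user] = 0.0                              # burn the current debt tokens
    target[user] = target.get(user, 0.0) + amount   # mint the same amount in the other mode
    return new_mode


stable_ledger = {"alice": 250.0}
variable_ledger = {}
mode_after_swap = toy_swap_rate_mode(stable_ledger, variable_ledger, "alice", "stable")
```

The user's total debt is unchanged by the swap; only the ledger it sits in, and therefore the rate applied to it, changes.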

Interested in learning about liquidations and rewards? Continue reading Part II of this series! 🙂

Building AI Security Tooling and Web3 Security

Lucas Martin Calderon on Building AI Security Tooling for Web2 and Web3

31/10/2023

Ask me anything with Pentestify CEO: 0-1 journey of a smart contract auditor

12/10/2023

Are you starting your journey as a smart contract auditor? This is a unique opportunity to learn from the experience of Lucas Martin Calderon, co-founder of Pentestify.

Tune in on October 13th at 2.30pm BST: https://twitter.com/i/spaces/1RDGllegENMGL

Don’t know what to expect? Here is an excerpt from his last interview with DeFi talents:

“Blockchain, cybersecurity and AI have been the three pillars upon which all my creations have been built. From programming the Bitcoin environment and mining coins using my own automatic software to becoming an Ethereum contributor and DAO developer, I have come a long way since 2016 and now advise tier 1 banks, DeFi protocols and SANS students on blockchain and smart contract security.”

“My professional journey began as a software engineer at Deutsche Bank, in the heart of London’s financial district. Though it was a role many would covet, my entrepreneurial spirit led me to accept a unique opportunity at STATION F in Paris, the world’s largest startup incubator, when my team and I were accepted. This choice marked the start of my journey into the transformative intersection of artificial intelligence (AI) and smart contract security, in ways that I could not have foreseen. This was also around the time when I was awarded SuperNova Under 30 by Nova Talent, in the realm of cybersecurity and computer science in Spain.

“Born from this pursuit was Pentestify, my brainchild and current primary focus. As the founder and CEO, I’ve dedicated myself to creating a pioneering platform that automates the detection of smart contract security vulnerabilities using cutting-edge AI models. With esteemed partners such as University College London and global tier-1 banks, Pentestify has rapidly evolved into an industry-leading SaaS solution in the blockchain security space.

“My international upbringing and education, spanning France, Germany, Spain, and the UK, have indubitably shaped my unique perspective on technology. I’ve always championed a growth mindset, which has been instrumental in my public learning approach, fueling my constant desire to broaden my horizons and master new skills.

“As my career progressed, I found myself at the intersection of various ventures, from co-founding Web3Sec, a globally recognized blockchain security news aggregator newsletter, to launching zkToro, a DeFi protocol designed to enhance social trading while preserving user privacy, which I believe will undoubtedly pioneer the field of privacy-enhancing quant trading in the upcoming years.”

Pentestify joins the ranks of Europe’s top 50 blockchain startups – European blockchain convention

01/10/2023

The European Blockchain Convention gathers 5,000+ crypto and blockchain enthusiasts, 300+ speakers from leading blockchain companies, and 200+ exhibitors for 2 days in Barcelona. 

Each year, the European Blockchain Convention organises the EBC Startup Battle, the largest early-stage blockchain startup competition in Europe. In each edition, 50 finalist startups are carefully selected from a vast pool of applications to pitch their ideas in front of the whole audience.

As one of the finalists, Pentestify will pitch for a chance to win the €35,000 award granted to the winner of the competition. Beyond the competition, we truly look forward to meeting the other promising startups and connecting with other leading blockchain companies.

Barcelona, here we come! 🚀

Scraping Bits Podcast: Pentestify’s Penetration – Generating Smart Contract Exploits With AI DRL

Scraping Bits Podcast: Pentestify's Penetration: Generating Smart Contract Exploits With AI DRL - Ft. Lucas Martin Calderon

25/09/2023

Pentestify co-founder Lucas Martin Calderon was invited to DeGatchi’s Scraping Bits Podcast 🎙️🎙️ to talk about his journey as an entrepreneur in the field of cybersecurity and how Pentestify represents the culmination of all his work in cybersecurity, blockchain and AI over the last decade 🔥

Before diving into how Pentestify is revolutionising smart contract security auditing with AI, DeGatchi and Lucas Martin Calderon talk about the evolution of AI and blockchain, as well as how our right to privacy 🔓 has reached a crossroads where two possible outcomes can occur depending on how we decide to use blockchain technology in the next couple of years…

Tune in to hear more about how we see the blockchain security space evolving over the next two years, as well as Pentestify’s plans for the coming months 🤫

Thanks for having us, DeGatchi! We really enjoyed the chat.

DappCon23 Day 2 – Blockchain Security Marries AI by Lucas Martin Calderon

20/09/2023

Pentestify CEO Lucas Martin Calderon was invited to the DappCon 2023 Summit to talk about marrying AI and blockchain security. In this talk, Lucas goes back to the basics of how AI actually works in order to get a better understanding of where it is heading: “Getting the basics right will help make a product that truly makes a difference and that will have the biggest impact”. He then covers the latest developments in both open- and closed-source AI models, their advantages and disadvantages, and why Pentestify chose DRL and GNN models to build NEO, an automated, post-deployment smart contract vulnerability detection and remediation SaaS.

Find the full transcript below: 

Hello everybody! 

I’m super happy to be here, especially because I’ve been thinking quite a lot about what to speak about today: whether to speak quite technically about what AI is, or to compare our tools to other world-leading tools like static analyzers, dynamic analyzers or formal verification tools. At the end of the day, I decided to truly maximize the impact and go back to the basics of how AI works, to understand where the future is heading, so you can get the basics right and hence develop a product that really makes a difference.

As Elon Musk says: “one of the problems with entrepreneurs, with smart engineers, smart people, is that we focus on the wrong problems, so we try to optimize things that shouldn’t even exist”.

So who here in this room is afraid of AI replacing your job, or actually taking a bit of your job? Are there any takers?

Well, by the end of this, if you feel a little bit worse, then that means that I’ve done a good job.

Today we are going to talk about how AI works, why you should fear AI, and the ways to think about it. And the exact same thing for the blockchain.

So what is a neural network?

You already have some inputs. In the middle, between the inputs and the outputs, that’s where the layers come in, right? That’s where you have some weights. The weights could mean how strong the relations between these variables are, and then those “B”s represent the biases. Training an AI model means adjusting them to minimize the error, that is, the loss function or cost function. Then you have an activation function, which could be softmax, ReLU or sigmoid, to simplify the output. Finally you have the outputs, which generally could be “it’s a car, it’s a carpet, it’s a mug”, and that would be classification, or it could be regression regarding numbers, and so on.

So again, we’ve talked a lot and heard a lot about ChatGPT, LLMs and Transformers: what they do, how they work. I would like to give a very basic explanation, yet enough for you guys to understand why ChatGPT or other LLMs wouldn’t be apt for finding certain vulnerabilities and so on.
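The pieces of a neural network named in the talk (inputs, weights, biases, an activation function and class outputs) fit together as in the following minimal sketch. The network shape, every weight and bias value, and the three labels are made up for illustration; a trained model would learn these values by minimizing a loss function.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


# Hypothetical 2-input, 2-hidden-unit, 3-class network.
hidden = relu(dense([1.0, 2.0], [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]))
probabilities = softmax(dense(hidden, [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]))

labels = ["car", "carpet", "mug"]
prediction = labels[probabilities.index(max(probabilities))]
```

Softmax turns the final scores into probabilities that sum to one, and the classification is simply the label with the highest probability.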

I don’t know if you can read it well but chat GPT is made of or Transformer networks is made of encoders and decoders right ?

Encoders get your input when you put text into chat GPT and decoders actually get that context and decode and output the end result right? And it’s quite easy and well it’s quite important to understand how self attention works, that actually comes from a paper from Google called attention is all you need from 2016, it is quite quite old but it’s the basics for all we’ve got right now. So if you want to use these tools to find vulnerabilities in smart contracts that’s a bit more tailored to Smart contract security auditors, you already start wrong if you use a pre-train model.  Why? Because the attention mechanism of LLMs or of chat GPT is already tailored to optimize and maximize the probability of the next word right ? So according to your questions according to the tokens and the whole embedding what is the probability of the next word popping up right ? 

This is important to know if you think about mathematics, if the smart contract does some mathematical reasoning. When you ask the model 5 + 5 and it outputs 10, it doesn’t know what 10 means, what addition means, or what five represents. It simply means that after all the terabytes of training data, academic papers and so on, most of the time there is a 10 at the end of that sequence. So do you want that to be driving the security of your web3 company ? I definitely hope not.
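The “most of the time there is a 10 at the end” point can be illustrated with a toy next-token predictor that only counts what followed a given context in its training text. This is a deliberately crude sketch: real LLMs use learned neural networks rather than lookup tables, and the tiny corpus below is invented.

```python
from collections import Counter, defaultdict

def train_next_token_counts(corpus, context_size):
    # Count which token followed each context of `context_size` tokens.
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(context_size, len(tokens)):
            context = tuple(tokens[i - context_size:i])
            counts[context][tokens[i]] += 1
    return counts

def next_token_probabilities(counts, context):
    # Turn the raw counts for one context into probabilities.
    seen = counts.get(tuple(context), Counter())
    total = sum(seen.values())
    return {token: n / total for token, n in seen.items()} if total else {}

# Invented training text: three correct sums and one wrong one.
corpus = ["5 + 5 = 10", "5 + 5 = 10", "5 + 5 = 10", "5 + 5 = 55"]
counts = train_next_token_counts(corpus, context_size=4)
probs = next_token_probabilities(counts, ["5", "+", "5", "="])
best = max(probs, key=probs.get)
print(best, probs)
```

The predictor answers “10” only because that token was the most frequent continuation, not because it performed an addition. A context it has never seen returns nothing at all.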

So when thinking about all these things: on the right, as you can see, that’s simply how AI sees the text that you put in. We call it artificial intelligence, but is it really intelligent ? How intelligent is it ?

The CEO of Stability AI said, and I might be wrong about this, that by the end of 2024 we will be able to have ChatGPT downloadable on our phones, and the size would be 5 GB, more or less.

Doesn’t that mean a lot already ? You’re able to train on terabytes of data.  You’re able to put a big portion of the internet, of the best books, of the smartest academic papers and then you’re able to download all that information into 5 gigabytes. 

Are we talking about a sort of compression algorithm ? Are we talking about something else ? Well, that something else is what we interpret as intelligence, but we tend to project our own intelligence onto how AI works, right ? It is something completely different. This is why, when really thinking about security, and even though certain models might output certain smart contract vulnerabilities, you truly need to understand how the model works first: what kind of intelligence, compression algorithm or pattern recognition it has, and what the limits of its creativity are.

This image is simply what the AI sees in the first step.

So on this slide you can see the word embeddings. Each row is a token and each column is a feature of the token, a dimension. A dimension can represent things that we couldn’t even imagine; in this case it might be word similarity or semantic similarity, and it’s quite important for the self-attention algorithm.
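As a rough illustration of the embedding table described here, the sketch below stores one row per token and one column per dimension, then compares rows with cosine similarity. The four-dimensional vectors are invented; real models learn hundreds or thousands of dimensions.

```python
import math

# Hypothetical embedding table: one row per token, one column per dimension.
EMBEDDINGS = {
    "animal": [0.9, 0.1, 0.0, 0.3],
    "dog":    [0.8, 0.2, 0.1, 0.3],
    "street": [0.1, 0.9, 0.2, 0.0],
}

def cosine_similarity(a, b):
    # 1.0 means the vectors point the same way, 0.0 means they are unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

animal_dog = cosine_similarity(EMBEDDINGS["animal"], EMBEDDINGS["dog"])
animal_street = cosine_similarity(EMBEDDINGS["animal"], EMBEDDINGS["street"])
print(f"animal vs dog: {animal_dog:.3f}, animal vs street: {animal_street:.3f}")
```

With these made-up numbers, “animal” sits much closer to “dog” than to “street”, which is the kind of semantic similarity a dimension can end up encoding.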

This is actually taken from Jay Alammar, one of the best illustrators of how AI works. In this example the sentence is “the animal didn’t cross the street because it was too tired”. When you look at how the AI sees the sentence, what does “it” refer to ? Does it go back to the animal or does it mean the street, right ?

This is where attention comes in and where it’s tailored to certain parameters. In this case, ChatGPT is tailored to make sense of the questions that users ask it, but it’s definitely not engineered to find vulnerabilities in smart contracts.
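The question of what “it” points back to is what scaled dot-product self-attention computes. The sketch below is a simplified version: it uses the raw embeddings as queries, keys and values, whereas a real Transformer first applies learned projection matrices. The two-dimensional vectors are invented so that “it” lies closer to “animal”.

```python
import math

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    # Simplification: queries, keys and values are the embeddings themselves.
    dim = len(embeddings[0])
    weights, outputs = [], []
    for query in embeddings:
        # Score every token against the query, scaled by sqrt(dimension).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in embeddings]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of all value vectors.
        outputs.append([sum(w * value[j] for w, value in zip(row, embeddings))
                        for j in range(dim)])
    return weights, outputs

tokens = ["animal", "street", "it"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # hypothetical vectors
weights, outputs = self_attention(embeddings)

it_row = weights[tokens.index("it")]
for token, weight in zip(tokens, it_row):
    print(f"'it' attends to {token!r} with weight {weight:.3f}")
```

Here “it” puts more weight on “animal” than on “street” purely because of how the vectors were chosen. What the weights end up favouring depends on what the model was trained to optimize.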

Now, however, we would like to talk about the opposite of the monopoly of AI and why AI companies should fear AI. It is really down to the community, and this is the message that I want to put across: it should be up to us, the community, and not these big corporations, to drive and set the pace for future developments in web3, including AI and blockchain.

So GPT-4 was the first commercially available one, even though Google started way before. Then we’ve got Cohere, Stability AI, Anthropic. In this case it even seems that Mark Zuckerberg is actually one of the good guys, as Meta is one of the biggest research companies putting their research out there for free.

This is how fast AI has been moving. The first event was on February 24th, when LLaMA was released. The code was open source, yet the weights of the model, which hold the intelligence learned during training, were private. Somehow they got leaked on 4chan, and since then there has been a super fast acceleration of people building on top of these tools. To give you an example, on March 13th it was already running on consumer hardware, and on March 19th “Vicuna” and other models already surpassed Google’s “Bard”. So something that cost around $200 million to train, in data and computing, already takes $300 to fine-tune. Fine-tuning simply means making the model better, or tailoring it to your needs, after it’s already been pre-trained.

GPT4All launches and already creates an entire ecosystem, and the biggest open-source LLM libraries and tools, like LangChain and the vector database Weaviate, pop up and try to catch up with that speed.

On March 30th BloombergGPT launches, and shortly after FinGPT, the open-source finance GPT, launches as well. So as you can see, since the weights of LLaMA were put out in the wild, definitely not by Meta but by someone else, I guess an insider, in less than six weeks we were already able to get the same or very similar performance, in accuracy, F1 score and precision, as models that took years of development and hundreds of millions of dollars to train. So this is again the moat. There was also a leaked Google memo saying “we have no moat, and neither does OpenAI”. By moat they simply mean competitive edge or competitive advantage. Certain advancements in the field of AI allowed all of that to happen. However, it is important to note that you can often adapt a model’s behaviour simply through prompting, so you don’t always need to fine-tune: a good prompt with context already helps.

However, two main technologies actually helped the open-source community carry all this advancement forward and achieve the same precision as Google or OpenAI: quantization and LoRA. LoRA is not to be confused with the LoRa of the electronics field; it stands for low-rank adaptation, and it can reduce the number of trainable parameters by up to 10,000 times, which helps free up the bottleneck of AI training, which is generally the GPUs. Quantization means faster inference and lower cost: by storing the model’s numbers at lower precision, you reduce the memory and compute needed for training and inference.
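A back-of-the-envelope sketch of both ideas follows. For LoRA, a frozen weight matrix of shape d × k receives a trainable update B·A of rank r, so only r·(d + k) numbers are trained instead of d·k. For quantization, floats are mapped to 8-bit integers with a single scale factor (symmetric int8, one of several possible schemes). The matrix size, the rank and the sample values are assumptions chosen for illustration; the reduction you get in practice depends on the model and the rank.

```python
def full_params(d, k):
    # Trainable numbers when fine-tuning the whole d x k weight matrix.
    return d * k

def lora_trainable_params(d, k, rank):
    # LoRA trains B (d x rank) and A (rank x k) and keeps the original frozen.
    return rank * (d + k)

def quantize_int8(values):
    # Symmetric quantization: map the largest magnitude to 127.
    scale = max(abs(v) for v in values) / 127 or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    # Recover approximate floats from the stored integers.
    return [q * scale for q in quantized]

d, k, rank = 4096, 4096, 8  # hypothetical layer size and LoRA rank
full = full_params(d, k)
lora = lora_trainable_params(d, k, rank)
print(f"full: {full:,}  LoRA: {lora:,}  reduction: {full // lora}x")

weights = [0.5, -1.27, 0.0, 1.0]  # made-up weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized, restored)
```

Each quantized weight fits in one byte instead of four, at the cost of a rounding error of at most half the scale factor per value.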

Now, the problems with generative AI. This is really where the research that I’ve been doing at Pentestify in collaboration with University College London comes in. We wanted to approach it in the most natural way, and by natural I mean that some of humanity’s best inventions, or rather executions and engineering achievements, came from observing nature. So what better way to represent how a human finds and thinks of vulnerabilities than actually studying the brain and how it reacts when presented with a smart contract ?

So we took 50 smart contract auditors and put them in an MRI machine to see which areas of the brain activated while they were auditing. There was a lot of noise in the data, but thank god we were helped by medical professionals who know exactly how to read and interpret it, and it actually aligned very well with the philosophy that we had at Pentestify from the beginning. The question was what kind of AI we should develop to really find vulnerabilities without having the same weaknesses as OpenAI’s ChatGPT. Again, we don’t want to optimize around an issue that shouldn’t exist in the first place, right ?

So we realized that the true intelligence for this very specific task was a mix of different types of intelligence. When the spatial and vision areas activated in the brain, there’s already an AI model tailored for that: graph neural networks, which are able to think sequentially through time and through functions. Again, as a smart contract auditor, when you’re creating or auditing code you might think of the control flow or data flow diagram, right ? But you might also be thinking directly about all the different function calls across time, or in a different order from the one that was intended. Graph neural networks are indeed able to achieve that task and hold different pieces of information in parallel at the same time.
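To give a feel for what a graph neural network does with a call graph, here is a minimal sketch of one round of message passing, in which every function’s feature vector is mixed with the average of its neighbours’. This is not Pentestify’s model: the function names, the two features and the fixed 50/50 mixing are assumptions, and a real GNN would use learned weights and several rounds.

```python
def message_passing_round(features, edges):
    # Build an undirected neighbour list from (caller, callee) pairs.
    neighbors = {node: [] for node in features}
    for caller, callee in edges:
        neighbors[caller].append(callee)
        neighbors[callee].append(caller)

    updated = {}
    for node, own in features.items():
        messages = [features[n] for n in neighbors[node]]
        if messages:
            # Aggregate: average the neighbours' features, column by column.
            aggregated = [sum(col) / len(messages) for col in zip(*messages)]
        else:
            aggregated = [0.0] * len(own)
        # Update: blend the node's own features with the aggregated message.
        updated[node] = [0.5 * o + 0.5 * a for o, a in zip(own, aggregated)]
    return updated

# Hypothetical contract: features are [makes_external_call, writes_state].
features = {
    "withdraw":  [1.0, 1.0],
    "_send":     [1.0, 0.0],
    "balanceOf": [0.0, 0.0],
}
edges = [("withdraw", "_send"), ("withdraw", "balanceOf")]

updated = message_passing_round(features, edges)
for name, vector in updated.items():
    print(name, vector)
```

After one round, even the harmless-looking `balanceOf` carries some signal from `withdraw`, which is how information about call relationships spreads through the graph.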

The language area of the brain was also quite active, and for that we use part of LLMs like ChatGPT, but only up to the embeddings point. The embeddings, again, are the representation of the tokens, of the words that you put in. We wanted to make sure that the attention mechanism was tailored to finding vulnerabilities instead of understanding the semantic context. Because at the end of the day we don’t want the AI to understand the semantics of the smart contract per se; we want it to understand the different instruction sets and their interactions with the EVM. There were a bunch of other algorithms as well, like long short-term memory (LSTM), for prediction. As for mathematical reasoning, we already know that an LLM might not be the best at it; even though it keeps improving, it is from the base, from scratch, the wrong AI model for these things.

The way blockchain evolves: the good ways, the bad ways. The fact that these are in red doesn’t mean that I don’t agree with those two, but they are definitely something that we should take into consideration: the fact that both things exist at the same time, and the need for both to exist at the same time. For the first time in history, our money in banks will be programmable. That means that if you receive money from your job, or even from your own venture, you might need to spend it in a timely manner because otherwise it could get burned, or you won’t be able to route it through certain channels because it would be forbidden. Again, control has never been so active.

So what is the best way to marry blockchain and AI in the context of security ? And why is it so interesting to mix them ? Is it simply because in web2 AI was so prevalent ? Well, in this case the infrastructure of blockchain really helps: for the first time the data is publicly available. We’ve accepted that our data and certain transactions are public, in exchange for at least the pleasure of not having them stored in a centralized server that someone else controls.

The sector is about to change as well with different cryptographic schemes: FHE and ZK are definitely there to change the game. And if you want to simulate different attacks on the blockchain, it’s never been so easy, with an infrastructure that you can fork, that you can literally copy and simulate.

Why is the security aspect so important ? I’m sure this is not a hard sell that I need to make here, but in 2023 alone over three billion dollars were stolen. 70% came from smart contracts, and 92% of those smart contracts had already been audited by top firms. This says nothing against these top firms, but rather shows that the field evolves and new vulnerabilities keep being found. Most tools, so static analyzers and dynamic analyzers, have predefined instructions, predefined vulnerabilities. The very basics of AI, of deep learning, is being able to train and then run inference on unseen data: being able to infer new vulnerabilities whose patterns the model hasn’t explicitly learned before, or that haven’t been entered by an expert.

So that’s what we do at Pentestify, through five different AI models. You simply give us the address of the smart contract and we get all the vulnerabilities, all the dependencies and the whole graph of the different smart contracts, and we continuously monitor it over time. We extract the potentially vulnerable patterns and store them in a database. Many of them are not vulnerable on their own, but they live in a multi-dimensional space that not even humans can understand; as you saw with the embeddings, AI uses way more dimensions than we can possibly imagine. A pattern remains in the database until a smart contract with a similar vulnerability is found, and then we alert the team immediately. We are also able to find variations of vulnerabilities like reentrancy that a static analyzer wouldn’t detect unless it received an update.
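The store-then-match workflow described here could look roughly like the sketch below: pattern embeddings are kept in a store, and a newly seen contract’s embedding raises an alert when it is similar enough to a stored one. The class name, the three-dimensional vectors and the 0.95 threshold are all hypothetical; the talk does not say how Pentestify implements this.

```python
import math

class PatternStore:
    """Hypothetical in-memory store of labelled pattern embeddings."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.patterns = []  # list of (label, vector) pairs

    def add(self, label, vector):
        self.patterns.append((label, vector))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(y * y for y in b)))

    def match(self, vector):
        # Return the labels of stored patterns at or above the threshold.
        return [label for label, stored in self.patterns
                if self._cosine(vector, stored) >= self.threshold]

store = PatternStore(threshold=0.95)
store.add("reentrancy", [0.9, 0.1, 0.4])         # made-up embeddings
store.add("integer-overflow", [0.1, 0.9, 0.0])

new_contract_embedding = [0.85, 0.15, 0.45]
alerts = store.match(new_contract_embedding)
if alerts:
    print("alert the team:", alerts)
```

Because the comparison is on similarity rather than on exact rules, a slightly different variant of a stored pattern can still match, which is the contrast the talk draws with static analyzers.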

So yeah, thank you very much. I was only able to give an overview of many, many topics, but I’m more than happy to answer any questions now or after.

Yes, now it’s time for questions, so if anyone has questions please raise your hands.

Lucas Martin Calderon – Speaker for the DappCon summit 2023

04/09/2023

Pentestify’s CEO has been invited to the DappCon Summit 2023 to give a talk on the role AI can play in detecting and remediating smart contract vulnerabilities for DeFi protocols.  Lucas will be covering the latest tools and techniques used by top smart contract security auditors and demonstrating the latest advancement in AI in comparison to traditional formal, dynamic and static security auditing methodologies. 

Book your tickets here.