(Summary) Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning

Yuan Ko
Disassembly

--

Biscotti is a platform that combines blockchain with federated learning, protecting privacy while maintaining the accuracy of federated learning at the same time.

Federated learning faces a few challenges, such as Sybil attacks, poisoning attacks, information leakage attacks, and the utility loss that comes with differential privacy.

Biscotti offers answers to these challenges. Let's figure out what Biscotti does and how its architecture works.

It has three roles: noiser, verifier, aggregator

  • noiser: provides pre-committed noise to prevent information leakage
  • verifier: receives masked updates and uses Multi-KRUM to filter out poisoned updates
  • aggregator: uses a secret sharing scheme to aggregate the updates and creates a new block with the aggregated gradient

Here are the architecture and the procedure of Biscotti.

Goals of Biscotti

  • Convergence to an optimal global model (the model trained without adversaries in a federated learning setting)
  • Poisoning is prevented by verifying peer contributions to the model
  • Peer training data is kept private and information leakage attacks on training data are prevented
  • Colluding peers cannot gain influence without acquiring sufficient stake

Blockchain assumptions and attacker assumptions

Biscotti uses proof of stake (PoS) as its consensus mechanism, and peers gain stake by providing SGD updates or by facilitating the consensus process. Each peer is connected to some subset of peers that allows flooding-based dissemination of blocks. Therefore, a peer can be offline during training and join later. I suppose that is why Biscotti doesn't use DPoS as its consensus.

All the model parameters are known to every peer, such as the hyperparameters and the optimization algorithm.

Assume that adversaries harm model performance only through label-flipping poisoning attacks.

How does Biscotti select peers?

To select verifiers and aggregators, Biscotti uses SHA-256 to repeatedly re-hash the last block and maps each result onto a peer, proportionally to the peers' stake.
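The stake-weighted selection above can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: each peer owns an interval of a "stake roll" proportional to its stake, and repeated SHA-256 re-hashing of the last block picks points on that roll.

```python
import hashlib

def select_committee(last_block_hash: bytes, stakes: dict, size: int) -> list:
    """Stake-weighted committee selection by repeatedly re-hashing the
    last block (illustrative sketch; names and structure are assumptions)."""
    # Build a stake roll: each peer owns an interval proportional to its stake.
    total = sum(stakes.values())
    intervals, acc = [], 0
    for peer, stake in stakes.items():
        acc += stake
        intervals.append((acc, peer))

    committee, digest = [], last_block_hash
    while len(committee) < size:
        digest = hashlib.sha256(digest).digest()       # re-hash the last block
        point = int.from_bytes(digest, "big") % total  # map hash onto the roll
        for upper, peer in intervals:
            if point < upper:
                if peer not in committee:              # keep members distinct
                    committee.append(peer)
                break
    return committee
```

Because the mapping is proportional to stake, a peer holding most of the stake is selected far more often, which is exactly what makes committee capture expensive for a low-stake adversary.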

To select noisers, each peer arrives at a committee unique to itself via consistent hashing with a different initial hash. In Biscotti, we get a unique result by passing a secret key (the key the VRF creates, not the peer's own secret key) and the SHA-256 hash of the last block to the VRF.

I am wondering whether the VRF uses the secret key to encrypt the hash produced by SHA-256.
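A rough sketch of the per-peer noiser selection, with a loud caveat: a real VRF outputs a random value plus a publicly checkable proof; the HMAC below only mimics the keyed, deterministic randomness and produces no proof, so it is a stand-in for illustration, not the paper's VRF.

```python
import hashlib
import hmac

def noiser_committee(secret_key: bytes, last_block_hash: bytes,
                     peers: list, k: int) -> list:
    """Per-peer noiser selection sketch: keying the randomness with the
    peer's secret makes the committee unique to that peer, so a colluding
    aggregator cannot predict or stack it. HMAC replaces the VRF here."""
    seed = hmac.new(secret_key, hashlib.sha256(last_block_hash).digest(),
                    hashlib.sha256).digest()
    chosen = []
    while len(chosen) < k:
        seed = hashlib.sha256(seed).digest()           # extend the stream
        peer = peers[int.from_bytes(seed, "big") % len(peers)]
        if peer not in chosen:                         # k distinct noisers
            chosen.append(peer)
    return chosen
```

The key point the sketch shows: two peers with different secrets derive different committees from the same block hash, while each peer's own committee stays deterministic and thus auditable once the proof is revealed.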

Noising protocol

Noise must be pre-committed to thwart poisoning. An attacker could add noise N to a poisoned update W to make it look like an honest update, so Biscotti requires noise to be pre-committed in the genesis block. This way, noise cannot be crafted on the fly to "unpoison" an update.

A set of noisers is chosen to thwart information leakage. Suppose a noiser A colludes with a verifier or aggregator B to leak victim C's information: A can provide zero noise, so C cannot hide its gradient, and the information leaks when the gradient is sent to B.

This motivates Biscotti to use a VRF that selects the group of noising peers based on C's secret key.

Verification protocol

A verifier peer is responsible for filtering out poisoned updates, rejecting them with Multi-KRUM.

A verifier receives:

  • The masked SGD update: (∆wi +∑k ζk)
  • Commitment to the SGD update: COMM(∆wi)
  • The set of k noise commitments: {COMM(ζ1), COMM(ζ2), …, COMM(ζk)}
  • A VRF proof confirming the identity of the k noise peers

A masked update is legitimate if the following equality holds, based on the homomorphic property of the commitments:

COMM(∆wi + ∑k ζk) = COMM(∆wi) · ∏k COMM(ζk)
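A toy demonstration of this check, under a big assumption: the commitment here is COMM(x) = g^x mod p, which is additively homomorphic like the paper's polynomial commitments but is neither hiding nor the actual scheme Biscotti uses. It only shows why the verifier's multiplicative check passes for an honestly masked update.

```python
# Toy additively homomorphic commitment: COMM(x) = g^x mod p.
# Parameters are illustrative, not from the paper.
P = 2**127 - 1   # a Mersenne prime
G = 3

def comm(x: int) -> int:
    """COMM(a) * COMM(b) mod p == COMM(a + b): the property the check relies on."""
    return pow(G, x % (P - 1), P)

# A peer's update and the k pre-committed noise terms (toy integers).
dw = 42
zetas = [7, -5, 11]

masked = dw + sum(zetas)          # what the verifier actually receives
lhs = comm(masked)                # COMM(Δwi + Σ ζk), recomputed by the verifier
rhs = comm(dw)                    # COMM(Δwi) from the peer ...
for z in zetas:
    rhs = (rhs * comm(z)) % P     # ... times Π COMM(ζk) from the genesis block

assert lhs == rhs                 # the equality holds for an honest masking
```

If the peer masked a different update than the one it committed to, the two sides would disagree and the verifier would reject before Multi-KRUM even runs.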

Once an update is deemed legitimate, Multi-KRUM is used to filter out poisoned updates.
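A minimal NumPy sketch of Multi-KRUM, assuming n collected updates of which at most f are Byzantine: each update is scored by the summed squared distances to its n − f − 2 nearest neighbours, and the m lowest-scoring updates are kept. Function names and the toy data are my own.

```python
import numpy as np

def multi_krum(updates: np.ndarray, f: int, m: int) -> list:
    """Multi-KRUM sketch: poisoned updates sit far from the honest
    cluster, so their nearest-neighbour distance scores are large and
    they fall outside the m accepted updates."""
    n = len(updates)
    # Pairwise squared Euclidean distances between all updates.
    dists = np.linalg.norm(updates[:, None] - updates[None, :], axis=-1) ** 2
    scores = []
    for i in range(n):
        nearest = np.sort(np.delete(dists[i], i))[: n - f - 2]
        scores.append(nearest.sum())
    return list(np.argsort(scores)[:m])    # indices of accepted updates

# Toy round: 8 honest updates near 0, 2 poisoned updates far away.
rng = np.random.default_rng(0)
honest = rng.normal(0.0, 0.1, size=(8, 5))
poisoned = rng.normal(5.0, 0.1, size=(2, 5))
accepted = multi_krum(np.vstack([honest, poisoned]), f=2, m=6)
```

With these parameters the accepted indices all fall inside the honest cluster, which is the behaviour the verification protocol counts on.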

Aggregation protocol

Aggregation faces a dilemma: no individual update should be observable, yet their sum must be stored in a block. Biscotti resolves this by combining polynomial commitments with verifiable secret sharing of the individual updates.

An update of length d is encoded as a d-degree polynomial, and Biscotti splits the unmasked update into n shares (n = 2 · (d + 1)) distributed among the aggregators. Each aggregator sums the shares it holds and distributes the result to the other aggregators. Individual updates cannot be recognized, since aggregators only broadcast the sum of the shares they accepted. The complete aggregate can only be reconstructed by controlling d + 1 shares, i.e. half of the aggregators.
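The algebra behind this can be shown in a few lines. Caveats: the real scheme works over a finite field with commitments and verifiability; this float sketch only demonstrates that shares add point-wise, so summed shares reconstruct the aggregate polynomial and never an individual update. All names are illustrative.

```python
import numpy as np

def shares_of(update: np.ndarray, n: int):
    """Treat a length-(d+1) update as the coefficients of a degree-d
    polynomial and hand out n point evaluations as shares (sketch)."""
    xs = np.arange(1, n + 1)
    return xs, np.polyval(update[::-1], xs)   # evaluate c0 + c1*x + ... at xs

# Two peers' updates with d + 1 = 3 entries, so n = 2 * (d + 1) = 6 shares.
u1 = np.array([1.0, 2.0, 3.0])
u2 = np.array([4.0, 0.0, 1.0])
xs, s1 = shares_of(u1, 6)
_,  s2 = shares_of(u2, 6)

# Aggregators sum shares point-wise; any d + 1 = 3 summed shares determine
# the degree-2 aggregate polynomial, recovering only u1 + u2.
summed = s1 + s2
coeffs = np.polyfit(xs[:3], summed[:3], deg=2)[::-1]
```

The interpolated coefficients equal u1 + u2, while any fewer than d + 1 shares of a single peer's polynomial leave that peer's update undetermined.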

Everything described above builds a federated learning architecture that resists poisoning attacks while maintaining the performance of the model.

Evaluation

  • Minimum committee size to prevent collusion

The probability of committee selection is based on stake, so we can use the binomial distribution to obtain an upper bound on the probability that colluders capture a committee.

The larger the committee, the more stake an adversary needs to own.
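The binomial bound can be computed directly. Simplifying assumption (mine, not the paper's exact model): each committee seat is an independent Bernoulli draw won by the adversary with probability equal to its stake fraction.

```python
from math import comb

def collusion_prob(m: int, stake: float, threshold: int) -> float:
    """P[adversary wins >= threshold of m stake-weighted committee seats],
    modelling each seat as an independent Bernoulli(stake) draw."""
    return sum(comb(m, k) * stake**k * (1 - stake)**(m - k)
               for k in range(threshold, m + 1))

# An adversary with 30% stake trying to capture a majority:
small = collusion_prob(10, 0.3, 6)   # 10-seat committee
large = collusion_prob(50, 0.3, 26)  # 50-seat committee: far harder
```

Growing the committee drives the capture probability down exponentially, which is why the minimum committee size is the knob that prices collusion in stake.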

  • Tolerating poisoning attacks

As the percentage of poisoned updates grows, Multi-KRUM has to collect more updates per round to keep the gradient correct.

It needs to collect 70% of the updates to withstand 30% poisoners.

There are some other evaluations of it; I'll summarize them when I have leisure time.

Some questions for myself:

Q1: What noise does a noiser provide?

The noise is sampled from a normal distribution.

Q2: Why do we have to record updates on the blockchain?

It's for security. Updates can be verified by everyone on the blockchain; moreover, peers can join the training process even if they were not online when training started.

Q3: Unclear points in noiser selection

It is a little different from the selection of verifiers and aggregators. The peer passes the hash of the last block to the VRF and gets a random number along with a proof of that number. Passing the random number into SHA-256 yields the peer's ticket for the lottery.

Q4: Why don't peers add the noise themselves?

We cannot tell whether a peer is honest; if peers could choose their own noise, an attacker could add noise that "unpoisons" a poisoned update so that Multi-KRUM would not recognize it.

Q5: The update data is split with Shamir secret sharing, but how can we split a signed update? Can an aggregator verify that an update came from a verifier?

I will read the code to check the process. Recording the question here; I will figure it out later.

I believe that combining blockchain with federated learning provides a way to resist attacks from malicious peers. It can also extend to the concept of the value of data, which is not yet a very popular discussion.
