This tutorial shows you how to connect SlateDB to S3. We'll use LocalStack to simulate S3.
Create a project
Let's start by creating a new Rust project:
cargo init slatedb-playground
cd slatedb-playground
Add dependencies
Now add SlateDB and the required dependencies to your Cargo.toml:
cargo add slatedb tokio --features tokio/macros,tokio/rt-multi-thread
cargo add object_store --features object_store/aws
cargo add anyhow
Setup
You will need to have LocalStack running. You can install it using Homebrew:
brew install localstack/tap/localstack-cli
localstack start -d
For a more detailed setup, see the LocalStack documentation.
You'll also need the AWS CLI:
brew install awscli
Initialize AWS
SlateDB needs an S3 bucket to store its data.
Create your S3 bucket:
# Create S3 bucket
aws --endpoint-url=http://localhost:4566 s3api create-bucket --bucket slatedb --region us-east-1
Write some code
Stick this into your src/main.rs file:
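The rendered docs page pulls this example in from the repository's examples directory, so the code itself is not shown here. Below is a minimal sketch of what `src/main.rs` can look like. The key name and the LocalStack test credentials are assumptions, and the exact SlateDB signatures (`Db::open`, `put`, `close`) vary between releases; the database path `tmp/slatedb_s3_compatible` matches the listings later in this tutorial:

```rust
use std::sync::Arc;

use object_store::aws::AmazonS3Builder;
use slatedb::Db;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Point the object_store S3 client at LocalStack instead of real AWS.
    // LocalStack accepts any access key/secret pair.
    let object_store = Arc::new(
        AmazonS3Builder::new()
            .with_endpoint("http://localhost:4566")
            .with_allow_http(true) // LocalStack serves plain HTTP
            .with_bucket_name("slatedb")
            .with_region("us-east-1")
            .with_access_key_id("test")
            .with_secret_access_key("test")
            .build()?,
    );

    // Open (or create) a database rooted at this path inside the bucket.
    let db = Db::open("tmp/slatedb_s3_compatible", object_store).await?;

    // Write a single 64 MiB value under a hypothetical key.
    let value = vec![b'x'; 64 * 1024 * 1024];
    db.put(b"big", &value).await?;

    // Closing flushes any outstanding writes before exiting.
    db.close().await?;
    Ok(())
}
```

This requires LocalStack to be running (from the Setup step); otherwise the S3 calls will fail at `Db::open`.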
Run the code
Now you can run the code:
cargo run
This will write a 64 MiB value to SlateDB.
Check the results
Now let's check the database path in the bucket:
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/
PRE compacted/
PRE compactions/
PRE manifest/
PRE wal/
There are four folders:
manifest: Contains the manifest files. Manifest files define the state of the DB, including the set of SSTs that are part of the DB.
wal: Contains the write-ahead log files.
compacted: Contains the L0 and compacted SST files.
compactions: Contains the compaction-state snapshots.
Let's check the wal folder:
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/wal/
2024-09-04 18:05:57 64 00000000000000000001.sst
2024-09-04 18:05:58 67108996 00000000000000000002.sst
Each of these SST files is a write-ahead log (WAL) file, and each WAL file can contain many RowEntry values. They get flushed based on the flush_interval config. The second WAL file is just over 64 MiB because it contains the value we wrote.
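Since WAL flushing is driven by the flush_interval setting, you can tune it when opening the database. The sketch below is an assumption-heavy illustration: the `Db::builder`, `Settings`, and `flush_interval` names follow recent SlateDB releases and differ in older versions, and the 50 ms value is arbitrary:

```rust
use std::{sync::Arc, time::Duration};

use object_store::ObjectStore;
use slatedb::{config::Settings, Db};

// Hedged sketch: `Settings` and its `flush_interval` field follow recent
// SlateDB releases; check your version's config docs for the exact names.
async fn open_with_fast_flush(store: Arc<dyn ObjectStore>) -> anyhow::Result<Db> {
    let settings = Settings {
        // Flush the in-memory WAL buffer to object storage every 50 ms.
        flush_interval: Some(Duration::from_millis(50)),
        ..Default::default()
    };
    let db = Db::builder("tmp/slatedb_s3_compatible", store)
        .with_settings(settings)
        .build()
        .await?;
    Ok(db)
}
```

A shorter interval lowers write-to-durable latency but produces more, smaller WAL SSTs like the ones listed above.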
Finally, let's check the compacted folder:
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/compacted/
2024-09-04 18:05:59 67108996 01J6ZVEZ394GCJT1PHZYY1NZGP.sst
Again, we see the 64 MiB SST file. This is the L0 SST file that was flushed with our value. Over time, the WAL entries will be removed, and the L0 SSTs will be compacted into higher levels.