Merge branch 'master' into fix-null-constraint
Kimahriman committed Jun 22, 2024
2 parents 84a0b17 + 6b01387 commit 72eaaf0
Showing 4,462 changed files with 581,664 additions and 73,231 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
12 changes: 12 additions & 0 deletions .github/ISSUE_TEMPLATE/bug-issue.md
@@ -8,6 +8,18 @@ title: '[BUG]'

## Bug

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the issue title
For example: [BUG][Spark] Title of my issue
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

### Describe the problem

#### Steps to reproduce
12 changes: 12 additions & 0 deletions .github/ISSUE_TEMPLATE/feature-request.md
@@ -8,6 +8,18 @@ title: '[Feature Request]'

## Feature request

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the issue title
For example: [Feature Request][Spark] Title of my issue
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

### Overview

<!-- Provide a high-level description of the feature request. -->
28 changes: 28 additions & 0 deletions .github/ISSUE_TEMPLATE/protocol-rfc.md
@@ -0,0 +1,28 @@
---
name: Protocol Change Request
about: Use this template to propose a new feature that impacts the Delta protocol specification
labels: 'protocol'
title: '[PROTOCOL RFC]'

---

## Protocol Change Request

### Description of the protocol change

<!--
Please describe the motivation and high-level description of the protocol change you are proposing.
For a fairly large protocol change, it is recommended that you provide a design doc - (e.g., a google doc, preferably with the ability to comment in the doc).
For the next steps on how to proceed with the request, see the protocol RFC process in https://github.com/delta-io/delta/tree/master/protocol_rfcs
-->


### Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or another member of your organization be willing to contribute this feature to the Delta Lake code base?

- [ ] Yes. I can contribute.
- [ ] Yes. I would be willing to contribute with guidance from the Delta Lake community.
- [ ] No. I cannot contribute at this time.


12 changes: 12 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -8,6 +8,18 @@ Thanks for sending a pull request! Here are some tips for you:
6. If applicable, include the corresponding issue number in the PR title and link it in the body.
-->

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the pull request title
For example: [Spark] Title of my pull request
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

<!--
45 changes: 45 additions & 0 deletions .github/workflows/connectors_test.yaml
@@ -0,0 +1,45 @@
name: "Delta Connectors Tests"
on: [push, pull_request]
jobs:
  build:
    name: "Run tests"
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        # These Scala versions must match those in the build.sbt
        scala: [2.13.13, 2.12.18]
    steps:
      - uses: actions/checkout@v2
      - name: install java
        uses: actions/setup-java@v2
        with:
          distribution: 'zulu'
          java-version: '8'
      - name: Cache Scala, SBT
        uses: actions/cache@v2
        with:
          path: |
            ~/.sbt
            ~/.ivy2
            ~/.cache/coursier
            ~/.m2
          key: build-cache-3-with-scala_${{ matrix.scala }}
      - name: Run Scala Style tests on test sources (Scala 2.12 only)
        run: build/sbt "++ ${{ matrix.scala }}" testScalastyle
        if: startsWith(matrix.scala, '2.12.')
      - name: Run sqlDeltaImport tests (Scala 2.12 and 2.13 only)
        run: build/sbt "++ ${{ matrix.scala }}" sqlDeltaImport/test
        if: ${{ !startsWith(matrix.scala, '2.11.') }}
      # These tests are not working yet
      # - name: Run Delta Standalone Compatibility tests (Scala 2.12 only)
      #   run: build/sbt "++ ${{ matrix.scala }}" compatibility/test
      #   if: startsWith(matrix.scala, '2.12.')
      - name: Run Delta Standalone tests
        run: build/sbt "++ ${{ matrix.scala }}" standalone/test testStandaloneCosmetic/test standaloneParquet/test testParquetUtilsWithStandaloneCosmetic/test
      - name: Run Hive 3 tests
        run: build/sbt "++ ${{ matrix.scala }}" hiveMR/test hiveTez/test
      - name: Run Hive 2 tests
        run: build/sbt "++ ${{ matrix.scala }}" hive2MR/test hive2Tez/test
      - name: Run Flink tests (Scala 2.12 only)
        run: build/sbt -mem 3000 "++ ${{ matrix.scala }}" flink/test
        if: ${{ startsWith(matrix.scala, '2.12.') }}
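
The `if:` conditions in the connectors workflow mean that the two matrix entries do not run the same steps. The following is a minimal Python sketch of that selection logic, not part of the commit. The shortened step names, `STEPS`, and `steps_for` are hypothetical, and the predicates only mirror the `startsWith` checks as they appear in the workflow.

```python
# Hypothetical model of the step conditions in connectors_test.yaml.
# Scala versions copied from the workflow matrix.
MATRIX_SCALA = ["2.13.13", "2.12.18"]

# (short step name, predicate over the Scala version).
# Steps without an `if:` in the workflow use a constant-true predicate.
STEPS = [
    ("Scala Style tests", lambda v: v.startswith("2.12.")),
    ("sqlDeltaImport tests", lambda v: not v.startswith("2.11.")),
    ("Delta Standalone tests", lambda v: True),
    ("Hive 3 tests", lambda v: True),
    ("Hive 2 tests", lambda v: True),
    ("Flink tests", lambda v: v.startswith("2.12.")),
]


def steps_for(scala_version):
    """Return the names of the test steps that would run for one matrix entry."""
    return [name for name, condition in STEPS if condition(scala_version)]


if __name__ == "__main__":
    for version in MATRIX_SCALA:
        print(version, "->", steps_for(version))
```

Under this model the 2.13 entry skips the Scala Style and Flink steps, while the 2.12 entry runs every step that is not commented out.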
51 changes: 51 additions & 0 deletions .github/workflows/kernel_docs.yaml
@@ -0,0 +1,51 @@
# Simple workflow for deploying static content to GitHub Pages
name: Deploy static content to Pages

on:
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  # Single deploy job since we're just deploying
  deploy_docs:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: install java
        uses: actions/setup-java@v3
        with:
          distribution: "zulu"
          java-version: "8"
      - name: Generate docs
        run: |
          build/sbt kernelGroup/unidoc
          mkdir -p kernel/docs/snapshot/kernel-api/java
          mkdir -p kernel/docs/snapshot/kernel-defaults/java
          cp -r kernel/kernel-api/target/javaunidoc/. kernel/docs/snapshot/kernel-api/java/
          cp -r kernel/kernel-defaults/target/javaunidoc/. kernel/docs/snapshot/kernel-defaults/java/
      - name: Setup Pages
        uses: actions/configure-pages@v3
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v1
        with:
          # Upload kernel docs
          path: kernel/docs
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v2
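
The comment on the `concurrency` block of the Pages workflow describes a specific queueing policy: one deployment runs at a time, the in-progress run is never cancelled, and only the most recently queued run is kept. The sketch below simulates that policy in Python under those stated assumptions. `ConcurrencyGroup` and its methods are hypothetical names for illustration and are not a GitHub API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConcurrencyGroup:
    """Toy model of a concurrency group with cancel-in-progress: false."""

    running: Optional[str] = None   # the run currently deploying
    pending: Optional[str] = None   # at most one queued run is kept
    cancelled: List[str] = field(default_factory=list)

    def queue(self, run_id: str) -> None:
        """Queue a new run without ever interrupting the running one."""
        if self.running is None:
            self.running = run_id
            return
        # A newer run replaces any run that was already waiting.
        if self.pending is not None:
            self.cancelled.append(self.pending)
        self.pending = run_id

    def finish_running(self) -> None:
        """Complete the running deployment and promote the pending run."""
        self.running, self.pending = self.pending, None


if __name__ == "__main__":
    group = ConcurrencyGroup()
    for run in ["run-1", "run-2", "run-3"]:
        group.queue(run)
    print(group)
```

Queueing three runs in a row leaves the first one running, drops the second, and keeps the third waiting, which is the "skipping runs queued between" behavior the comment refers to.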
20 changes: 20 additions & 0 deletions .github/workflows/kernel_test.yaml
@@ -0,0 +1,20 @@
name: "Delta Kernel Tests"
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-20.04
    env:
      SCALA_VERSION: 2.12.18
    steps:
      - uses: actions/checkout@v3
      - name: install java
        uses: actions/setup-java@v3
        with:
          distribution: "zulu"
          java-version: "8"
      - name: Run tests
        run: |
          python run-tests.py --group kernel --coverage
      - name: Run integration tests
        run: |
          cd kernel/examples && python run-kernel-examples.py --use-local
55 changes: 55 additions & 0 deletions .github/workflows/spark_examples_test.yaml
@@ -0,0 +1,55 @@
name: "Delta Spark Local Publishing and Examples Compilation"
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        # These Scala versions must match those in the build.sbt
        scala: [2.12.18, 2.13.13]
    env:
      SCALA_VERSION: ${{ matrix.scala }}
    steps:
      - uses: actions/checkout@v3
      - uses: technote-space/get-diff-action@v4
        id: git-diff
        with:
          PATTERNS: |
            **
            .github/workflows/**
            !kernel/**
            !connectors/**
      - name: install java
        uses: actions/setup-java@v3
        with:
          distribution: "zulu"
          java-version: "8"
      - name: Cache Scala, SBT
        uses: actions/cache@v3
        with:
          path: |
            ~/.sbt
            ~/.ivy2
            ~/.cache/coursier
          # Change the key if dependencies are changed. For each key, GitHub Actions will cache
          # the above directories when we use the key for the first time. After that, each run will
          # just use the cache. The cache is immutable so we need to use a new key when trying to
          # cache new stuff.
          key: delta-sbt-cache-spark-examples-scala${{ matrix.scala }}
      - name: Install Job dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl git
          sudo apt install libedit-dev
        if: steps.git-diff.outputs.diff
      - name: Run Delta Spark Local Publishing and Examples Compilation
        # examples/scala/build.sbt will compile against the local Delta release version (e.g. 3.2.0-SNAPSHOT).
        # Thus, we need to publishM2 first so those jars are locally accessible.
        # We publish storage explicitly so that it is available for the Scala 2.13 build. As a Java project
        # it is typically only released when publishing for Scala 2.12.
        run: |
          build/sbt clean
          build/sbt storage/publishM2
          build/sbt "++ $SCALA_VERSION publishM2"
          cd examples/scala && build/sbt "++ $SCALA_VERSION compile"
        if: steps.git-diff.outputs.diff
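
The `PATTERNS` input combined with `if: steps.git-diff.outputs.diff` lets the Spark workflows skip their expensive steps when a change only touches `kernel/` or `connectors/`. Below is a rough Python approximation of that filter. It uses `fnmatch`, whose glob semantics may differ from those of the `get-diff-action` step, and `is_relevant` and `has_diff` are hypothetical helper names.

```python
from fnmatch import fnmatchcase

# Patterns copied from the workflow; a leading "!" marks an exclusion.
PATTERNS = ["**", ".github/workflows/**", "!kernel/**", "!connectors/**"]


def is_relevant(path, patterns=PATTERNS):
    """True if the path matches an include pattern and no exclude pattern."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    included = any(fnmatchcase(path, p) for p in includes)
    excluded = any(fnmatchcase(path, p) for p in excludes)
    return included and not excluded


def has_diff(changed_files):
    """Approximates the `diff` output: true if any changed file is relevant."""
    return any(is_relevant(f) for f in changed_files)


if __name__ == "__main__":
    print(has_diff(["kernel/kernel-api/pom.xml"]))
    print(has_diff(["kernel/kernel-api/pom.xml", "build.sbt"]))
```

In this approximation a change set confined to `kernel/` and `connectors/` yields no diff, so the guarded steps are skipped, while any other changed file enables them.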
51 changes: 51 additions & 0 deletions .github/workflows/spark_master_test.yaml
@@ -0,0 +1,51 @@
name: "Delta Spark Master Tests"
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        # These Scala versions must match those in the build.sbt
        scala: [2.13.13]
    env:
      SCALA_VERSION: ${{ matrix.scala }}
    steps:
      - uses: actions/checkout@v3
      - uses: technote-space/get-diff-action@v4
        id: git-diff
        with:
          PATTERNS: |
            **
            .github/workflows/**
            !kernel/**
            !connectors/**
      - name: install java
        uses: actions/setup-java@v3
        with:
          distribution: "zulu"
          java-version: "17"
      - name: Cache Scala, SBT
        uses: actions/cache@v3
        with:
          path: |
            ~/.sbt
            ~/.ivy2
            ~/.cache/coursier
            !~/.cache/coursier/v1/https/repository.apache.org/content/groups/snapshots
          # Change the key if dependencies are changed. For each key, GitHub Actions will cache
          # the above directories when we use the key for the first time. After that, each run will
          # just use the cache. The cache is immutable so we need to use a new key when trying to
          # cache new stuff.
          key: delta-sbt-cache-spark-master-scala${{ matrix.scala }}
      - name: Install Job dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl git
          sudo apt install libedit-dev
        if: steps.git-diff.outputs.diff
      - name: Run Spark Master tests
        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_test.yaml
        run: |
          TEST_PARALLELISM_COUNT=2 build/sbt -DsparkVersion=master "++ ${{ matrix.scala }}" clean spark/test
          TEST_PARALLELISM_COUNT=2 build/sbt -DsparkVersion=master "++ ${{ matrix.scala }}" clean connectServer/test
        if: steps.git-diff.outputs.diff
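
The cache comment in this workflow says that an entry is written the first time a key is used and is immutable afterwards, so caching new dependencies requires a new key. The Python sketch below models only that described behavior. `ImmutableCache` and `cache_key` are hypothetical names and do not reflect how `actions/cache` is implemented.

```python
class ImmutableCache:
    """Toy cache in which the first save for a key wins."""

    def __init__(self):
        self._entries = {}

    def restore(self, key):
        """Return the cached contents for the key, or None on a cache miss."""
        return self._entries.get(key)

    def save(self, key, contents):
        """Store contents only if the key is new; report whether it was stored."""
        if key in self._entries:
            return False
        self._entries[key] = frozenset(contents)
        return True


def cache_key(scala_version, prefix="delta-sbt-cache-spark-master-scala"):
    """Build a key the same way the workflow template does: prefix + Scala version."""
    return f"{prefix}{scala_version}"


if __name__ == "__main__":
    cache = ImmutableCache()
    key = cache_key("2.13.13")
    print(cache.save(key, {"~/.sbt", "~/.ivy2"}))
    print(cache.save(key, {"~/.sbt", "~/.ivy2", "~/.cache/coursier"}))
    print(sorted(cache.restore(key)))
```

The second `save` is rejected because the key already exists, which is why the comment recommends changing the key whenever dependencies change.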
76 changes: 76 additions & 0 deletions .github/workflows/spark_test.yaml
@@ -0,0 +1,76 @@
name: "Delta Spark Tests"
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        # These Scala versions must match those in the build.sbt
        scala: [2.12.18, 2.13.13]
    env:
      SCALA_VERSION: ${{ matrix.scala }}
    steps:
      - uses: actions/checkout@v3
      - uses: technote-space/get-diff-action@v4
        id: git-diff
        with:
          PATTERNS: |
            **
            .github/workflows/**
            !kernel/**
            !connectors/**
      - name: install java
        uses: actions/setup-java@v3
        with:
          distribution: "zulu"
          java-version: "8"
      - name: Cache Scala, SBT
        uses: actions/cache@v3
        with:
          path: |
            ~/.sbt
            ~/.ivy2
            ~/.cache/coursier
          # Change the key if dependencies are changed. For each key, GitHub Actions will cache
          # the above directories when we use the key for the first time. After that, each run will
          # just use the cache. The cache is immutable so we need to use a new key when trying to
          # cache new stuff.
          key: delta-sbt-cache-spark3.2-scala${{ matrix.scala }}
      - name: Install Job dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl git
          sudo apt install libedit-dev
          curl -LO https://github.com/bufbuild/buf/releases/download/v1.28.1/buf-Linux-x86_64.tar.gz
          mkdir -p ~/buf
          tar -xvzf buf-Linux-x86_64.tar.gz -C ~/buf --strip-components 1
          rm buf-Linux-x86_64.tar.gz
          sudo apt install python3-pip --fix-missing
          sudo pip3 install pipenv==2021.5.29
          curl https://pyenv.run | bash
          export PATH="~/.pyenv/bin:$PATH"
          eval "$(pyenv init -)"
          eval "$(pyenv virtualenv-init -)"
          pyenv install 3.8.18
          pyenv global system 3.8.18
          pipenv --python 3.8 install
          pipenv run pip install pyspark==3.5.0
          pipenv run pip install flake8==3.5.0 pypandoc==1.3.3
          pipenv run pip install black==23.9.1
          pipenv run pip install importlib_metadata==3.10.0
          pipenv run pip install mypy==0.982
          pipenv run pip install mypy-protobuf==3.3.0
          pipenv run pip install cryptography==37.0.4
          pipenv run pip install twine==4.0.1
          pipenv run pip install wheel==0.33.4
          pipenv run pip install setuptools==41.1.0
          pipenv run pip install pydocstyle==3.0.0
          pipenv run pip install pandas==1.1.3
          pipenv run pip install pyarrow==8.0.0
          pipenv run pip install numpy==1.20.3
        if: steps.git-diff.outputs.diff
      - name: Run Scala/Java and Python tests
        # when changing TEST_PARALLELISM_COUNT make sure to also change it in spark_master_test.yaml
        run: |
          TEST_PARALLELISM_COUNT=2 pipenv run python run-tests.py --group spark
        if: steps.git-diff.outputs.diff
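
The comments in `spark_test.yaml` and `spark_master_test.yaml` ask that `TEST_PARALLELISM_COUNT` be kept in sync by hand. One possible way to check this automatically is sketched below in Python. The helper names are hypothetical and the inline strings stand in for the contents of the two workflow files.

```python
import re


def parallelism_counts(workflow_text):
    """Collect every value assigned to TEST_PARALLELISM_COUNT in the text."""
    matches = re.findall(r"\bTEST_PARALLELISM_COUNT=(\d+)", workflow_text)
    return {int(value) for value in matches}


def counts_agree(*workflow_texts):
    """True if all workflows together use at most one distinct value."""
    all_counts = set()
    for text in workflow_texts:
        all_counts |= parallelism_counts(text)
    return len(all_counts) <= 1


if __name__ == "__main__":
    # Stand-ins for the run lines of the two workflow files.
    spark_test = "TEST_PARALLELISM_COUNT=2 pipenv run python run-tests.py --group spark"
    spark_master = (
        'TEST_PARALLELISM_COUNT=2 build/sbt -DsparkVersion=master "++ 2.13.13" clean spark/test'
    )
    print(counts_agree(spark_test, spark_master))
```

Comment lines that merely mention the variable are ignored, because the pattern only matches an assignment followed by digits.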