make_sql_engine and notebook experience (#170)
* make_sql_engine

* update

* update

* update

* update

* update

* update

* switch to use qpd as the default sql engine for native execution engine

* update

* update comments

* update

* update notebook comments and readme
goodwanghan authored Feb 4, 2021
1 parent c9feda1 commit 4b6d945
Showing 19 changed files with 779 additions and 161 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -56,6 +56,7 @@ package:
	python3 setup.py bdist_wheel

jupyter:
	mkdir -p tmp
	pip install .
	jupyter nbextension install --py fugue_notebook
	jupyter nbextension enable fugue_notebook --py
48 changes: 48 additions & 0 deletions README.md
@@ -155,6 +155,54 @@ For example a common use case is:
pip install fugue[sql,spark]
```

## Jupyter Notebook Extension (since 0.5.1)

```bash
pip install fugue
jupyter nbextension install --py fugue_notebook
jupyter nbextension enable fugue_notebook --py
```

After installing the Jupyter extension, you can use `%%fsql` magic cells, in which
the Fugue SQL is syntax highlighted.

You can also run `%%fsql` magic cells once you load the IPython extension.
Here is an example:

In cell 1

```
%load_ext fugue_notebook
```

In cell 2

```
%%fsql
CREATE [[0]] SCHEMA a:int
PRINT
```

In cell 3, where you want to use Dask

```
%%fsql dask
CREATE [[0]] SCHEMA a:int
PRINT
```

Note that you can automatically load the `fugue_notebook` IPython extension at startup;
read [this](https://ipython.readthedocs.io/en/stable/config/extensions/#using-extensions) to configure your Jupyter environment.

There is also an ad hoc way to set up your notebook environment that does not require installing anything or changing the startup script.
Run the following in the first cell of each notebook to get highlighting and runnable `%%fsql` cells:


```python
from fugue_notebook import setup
setup()
```

## Contributing

Feel free to message us on [Slack](https://join.slack.com/t/fugue-project/shared_invite/zt-jl0pcahu-KdlSOgi~fP50TZWmNxdWYQ). We also have [contributing instructions](CONTRIBUTING.md).
9 changes: 8 additions & 1 deletion fugue/__init__.py
@@ -16,10 +16,17 @@
from fugue.execution.execution_engine import ExecutionEngine, SQLEngine
from fugue.execution.factory import (
    make_execution_engine,
    make_sql_engine,
    register_default_execution_engine,
    register_default_sql_engine,
    register_execution_engine,
    register_sql_engine,
)
from fugue.execution.native_execution_engine import (
    NativeExecutionEngine,
    QPDPandasEngine,
    SqliteEngine,
)
from fugue.execution.native_execution_engine import NativeExecutionEngine, SqliteEngine
from fugue.extensions.creator import Creator, creator
from fugue.extensions.outputter import Outputter, outputter
from fugue.extensions.processor import Processor, processor
23 changes: 22 additions & 1 deletion fugue/execution/__init__.py
@@ -2,7 +2,28 @@
from fugue.execution.execution_engine import ExecutionEngine, SQLEngine
from fugue.execution.factory import (
    make_execution_engine,
    make_sql_engine,
    register_default_execution_engine,
    register_default_sql_engine,
    register_execution_engine,
    register_sql_engine,
)
from fugue.execution.native_execution_engine import (
    NativeExecutionEngine,
    SqliteEngine,
    QPDPandasEngine,
)

register_execution_engine(
    "native", lambda conf: NativeExecutionEngine(conf), on_dup="ignore"
)
register_execution_engine(
    "pandas", lambda conf: NativeExecutionEngine(conf), on_dup="ignore"
)
register_sql_engine("sqlite", lambda engine: SqliteEngine(engine), on_dup="ignore")
register_sql_engine(
    "qpdpandas", lambda engine: QPDPandasEngine(engine), on_dup="ignore"
)
register_sql_engine(
    "qpd_pandas", lambda engine: QPDPandasEngine(engine), on_dup="ignore"
)
from fugue.execution.native_execution_engine import NativeExecutionEngine, SqliteEngine
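
Reviewer note: the `register_sql_engine(..., on_dup="ignore")` calls above map a name to a factory that takes the execution engine. The sketch below illustrates that pattern only; it is not Fugue's implementation. `register_factory`, `make_from_name`, and the set of accepted `on_dup` values are hypothetical names and assumptions made for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a name -> factory registry; not Fugue's code.
_FACTORIES: Dict[str, Callable[[Any], Any]] = {}
_ON_DUP_VALUES = ("overwrite", "ignore", "throw")  # assumed set of options


def register_factory(
    name: str, factory: Callable[[Any], Any], on_dup: str = "overwrite"
) -> None:
    """Register ``factory`` under ``name``; ``on_dup`` decides what happens on a clash."""
    if on_dup not in _ON_DUP_VALUES:
        raise ValueError(f"unknown on_dup value: {on_dup}")
    if name in _FACTORIES:
        if on_dup == "ignore":
            return  # keep the factory that was registered first
        if on_dup == "throw":
            raise KeyError(f"{name} is already registered")
    _FACTORIES[name] = factory


def make_from_name(name: str, execution_engine: Any) -> Any:
    """Build an object by looking up the factory registered under ``name``."""
    return _FACTORIES[name](execution_engine)


# The second registration is ignored, so the first factory stays in place.
register_factory("sqlite", lambda engine: ("sqlite", engine), on_dup="ignore")
register_factory("sqlite", lambda engine: ("other", engine), on_dup="ignore")
result = make_from_name("sqlite", "native")
```

With "ignore", registrations made at import time do not clobber anything a user registered earlier under the same name, which is presumably why the package-level defaults use it.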
20 changes: 20 additions & 0 deletions fugue/execution/execution_engine.py
@@ -84,6 +84,7 @@ def __init__(self, conf: Any):
        self._rpc_server = make_rpc_server(self.conf)
        self._engine_start_lock = RLock()
        self._engine_start = 0
        self._sql_engine: Optional[SQLEngine] = None

    def start(self) -> None:
        """Start this execution engine, do not override.
@@ -133,8 +134,27 @@ def conf(self) -> ParamDict:

    @property
    def rpc_server(self) -> RPCServer:
        """:class:`~fugue.rpc.base.RPCServer` of this execution engine"""
        return self._rpc_server

    @property
    def sql_engine(self) -> SQLEngine:
        """The :class:`~.SQLEngine` currently used by this execution engine.
        You should use :meth:`~.set_sql_engine` to set a new SQLEngine
        instance. If not set, the default is :meth:`~.default_sql_engine`
        """
        if self._sql_engine is None:
            self._sql_engine = self.default_sql_engine
        return self._sql_engine

    def set_sql_engine(self, engine: SQLEngine) -> None:
        """Set :class:`~.SQLEngine` for this execution engine.
        If not set, the default is :meth:`~.default_sql_engine`
        :param engine: :class:`~.SQLEngine` instance
        """
        self._sql_engine = engine

    @property
    @abstractmethod
    def log(self) -> logging.Logger:  # pragma: no cover
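
Reviewer note: the `sql_engine` property added above resolves its default lazily and caches it, while `set_sql_engine` overrides whatever is cached. The following is a minimal, self-contained sketch of that behaviour using toy classes. `ToySQLEngine`, `ToyExecutionEngine`, and the `default_calls` counter are hypothetical and exist only for illustration; treating `default_sql_engine` as a property is an assumption based on how the diff accesses it.

```python
from typing import Optional


class ToySQLEngine:
    """Hypothetical stand-in for an SQL engine; only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


class ToyExecutionEngine:
    """Hypothetical stand-in that mirrors the lazy-default pattern in the diff."""

    def __init__(self) -> None:
        self._sql_engine: Optional[ToySQLEngine] = None
        self.default_calls = 0  # counts how often the default is constructed

    @property
    def default_sql_engine(self) -> ToySQLEngine:
        # Assumed to be a property, as the diff reads it without calling it.
        self.default_calls += 1
        return ToySQLEngine("default")

    @property
    def sql_engine(self) -> ToySQLEngine:
        # Resolve the default on first access, then reuse the cached instance.
        if self._sql_engine is None:
            self._sql_engine = self.default_sql_engine
        return self._sql_engine

    def set_sql_engine(self, engine: ToySQLEngine) -> None:
        # An explicit engine replaces whatever was cached or defaulted.
        self._sql_engine = engine


engine = ToyExecutionEngine()
first = engine.sql_engine
second = engine.sql_engine
engine.set_sql_engine(ToySQLEngine("custom"))
```

In this sketch the default is built only once, and once `set_sql_engine` is called the default is never consulted again.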