Linq; expression parser?

Marc Gravell

In Linq, you can apparently get a meaningful body from an
expression's .ToString(); random question - does anybody know if Linq
also includes a parser? It just seemed it might be a handy way to
write a safe but easy implementation (i.e. no CodeDom) for an
IBindingListView.Filter (by compiling to a Predicate<T>).

Anybody know if this is possible at all?
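
A minimal sketch of the two halves the question relies on, assuming nothing beyond System.Linq.Expressions: an expression tree can be printed with ToString() and compiled to a Predicate<T>. The class and member names (FilterSketch, IsEven, Describe, Build) are made up for illustration; the missing piece the question asks about is the step from text back to a tree.

```csharp
using System;
using System.Linq.Expressions;

static class FilterSketch
{
    // The C# compiler stores this lambda as an expression tree (data), not as IL.
    public static readonly Expression<Predicate<int>> IsEven = x => x % 2 == 0;

    // The text form of the tree that the question refers to.
    public static string Describe()
    {
        return IsEven.ToString();
    }

    // Compile() turns the tree into an ordinary delegate at runtime.
    public static Predicate<int> Build()
    {
        return IsEven.Compile();
    }
}
```

Calling FilterSketch.Build() yields a delegate usable anywhere a Predicate<int> is expected, for example List<int>.FindAll.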

Marc

May 25 '07 #1


Frans Bouma [C# MVP]

Marc Gravell wrote:

> In Linq, you can apparently get a meaningful body from an
> expression's .ToString(); random question - does anybody know if linq
> also includes a parser? It just seemed it might be a handy way to
> write a safe but easy implementation (i.e. no codedom) for an
> IBindingListView.Filter (by compiling to a Predicate<T>).
>
> Anybody know if this is possible at all?

Why would you use a parser on the string output? Because that parser
will produce a parse tree which will look very similar to the
expression tree you called ToString() on :)

So interpret the expression tree instead.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

May 26 '07 #2

Jon Skeet [C# MVP]

Frans Bouma [C# MVP] <pe******************@xs4all.nl> wrote:

> Marc Gravell wrote:
>
> > In Linq, you can apparently get a meaningful body from an
> > expression's .ToString(); random question - does anybody know if linq
> > also includes a parser? It just seemed it might be a handy way to
> > write a safe but easy implementation (i.e. no codedom) for an
> > IBindingListView.Filter (by compiling to a Predicate<T>).
> >
> > Anybody know if this is possible at all?
>
> Why would you use a parser on the string output? Because that parser
> will produce a parse tree which will look very similar to the
> expression tree you called ToString() on :)
>
> So interpret the expression tree instead.

I think the point would be to parse expression trees which *hadn't*
been created from ToString, but read in from a file etc. I understood
Marc's introductory sentence to effectively mean: "there's a useful
format for LINQ expressions, as shown by the ToString method".

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

May 26 '07 #3

Marc Gravell

Jon has correctly interpreted my witterings... my point being, that
unlike raw .Net 2.0 predicates (etc), LINQ expression trees are quite
structured. Obviously the C# 3-series compiler can parse source to
create an expression, but it is unclear to me (at least until
Reflector supports CLR 3.5 ;-p) how much of this is the compiler and
how much is the runtime.

The ability to parse a string to an expression would be very
powerful... although there would obviously be some limitations in
terms of resolving external entities... but heck: just access to the
expression arguments and literals would be pretty powerful.

Maybe I'm just "off on one"...

Marc

May 26 '07 #4

Frans Bouma [C# MVP]

Jon Skeet [C# MVP] wrote:

> Frans Bouma [C# MVP] <pe******************@xs4all.nl> wrote:
> > Marc Gravell wrote:
> >
> > > In Linq, you can apparently get a meaningful body from an
> > > expression's .ToString(); random question - does anybody know if
> > > linq also includes a parser? It just seemed it might be a handy
> > > way to write a safe but easy implementation (i.e. no codedom) for
> > > an IBindingListView.Filter (by compiling to a Predicate<T>).
> > >
> > > Anybody know if this is possible at all?
> >
> > Why would you use a parser on the string output? Because that
> > parser will produce a parse tree which will look very similar to the
> > expression tree you called ToString() on :)
> >
> > So interpret the expression tree instead.
>
> I think the point would be to parse expression trees which *hadn't*
> been created from ToString, but read in from a file etc. I understood
> Marc's introductory sentence to effectively mean: "there's a useful
> format for LINQ expressions, as shown by the ToString method".

I'm not sure if I understand you correctly. The C# source code is
compiled to code which builds an expression tree (if I understood
everything correctly), which at runtime results in an Expression tree
made of objects. This tree is then passed to the object which is the
source for the query, for example the o/r mapper engine which will
convert the tree to a sql query.

Expression trees aren't serializable, so to store them you need a
representation, which could be a string indeed. However I don't see a
use case for that, as the expression tree only lives at runtime anyway.

Now, to parse the string, you WILL end up with a parse tree (or your
parser isn't that maintainable ;)). This parse tree has likely a lot in
common with the expression tree, so you have to write the tree
traversal code / node interpretation code anyway. So why write the
parser as well, as you already get the expression tree handed to you?

FB


May 27 '07 #5

Frans Bouma [C# MVP]

Marc Gravell wrote:

> Jon has correctly interpreted my witterings... my point being, that
> unlike raw .Net 2.0 predicates (etc), LINQ expression trees are quite
> structured. Obviously the C# 3-series compiler can parse source to
> create an expression, but it is unclear to me (at least until
> Reflector supports CLR 3.5 ;-p) how much of this is the compiler and
> how much is the runtime.

It's my understanding that there are two routes:
1) The source the query will work on implements IEnumerable (e.g.
List<T>). This will make the compiler emit calls to extension methods
which take delegates.
2) The source the query will work on implements IQueryable. This will
make the compiler emit code which builds an expression tree. (Correct
me if I'm wrong, this is what I understood of it.)

So, the code executed at runtime, will build the expression tree for
you and hand it to the source object the query works on (in situation
2). From there, the source object is on its own and can do with the
expression tree what it wants.
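
The two routes can be sketched as follows. This is an illustration under the assumption that LINQ to Objects' AsQueryable() stands in for a real IQueryable provider; TwoRoutesSketch and its members are hypothetical names. In both cases the compiler emits a call to an extension method named Where; what differs is the parameter type of the chosen overload (a delegate on Enumerable, an Expression<...> on Queryable), and that decides whether the lambda becomes code or a tree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TwoRoutesSketch
{
    static readonly int[] Data = { 1, 2, 3, 4 };

    // Route 1: IEnumerable<T>. Enumerable.Where takes a Func<int, bool>,
    // so the lambda is compiled to a delegate.
    public static List<int> ViaEnumerable()
    {
        return Data.Where(x => x > 2).ToList();
    }

    // Route 2: IQueryable<T>. Queryable.Where takes an
    // Expression<Func<int, bool>>, so the lambda is captured as a tree
    // and handed to the query source inside IQueryable.Expression.
    public static IQueryable<int> ViaQueryable()
    {
        return Data.AsQueryable().Where(x => x > 2);
    }
}
```

Inspecting TwoRoutesSketch.ViaQueryable().Expression shows the method call node that a provider such as an O/R mapper would walk.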

This also means that errors in the tree could lead to runtime
exceptions rather than compile-time errors. For example, a
reference to a method which isn't available according to the
source object.

Now, what it also means is that you don't need a string parser, as
that would lead to a tree similar to the expression tree and you then
have to write the tree interpreter as well, so you win nothing from
using the string parser. There's one exception: expression trees aren't
serializable IIRC, so to serialize them, you could opt for text,
however references to properties/methods are hard to re-build I think.

> The ability to parse a string to an expression would be very
> powerful... although there would obviously be some limitations in
> terms of resolving external entities... but heck: just access to the
> expression arguments and literals would be pretty powerful.

Parsing a string to an expression is what every parser does ;). You
already have that built into the C# compiler (v3.5), so why re-do that?

FB

> Maybe I'm just "off on one"...


May 27 '07 #6

Jon Skeet [C# MVP]

Frans Bouma [C# MVP] <pe******************@xs4all.nl> wrote:

<snip>

> > The ability to parse a string to an expression would be very
> > powerful... although there would obviously be some limitations in
> > terms of resolving external entities... but heck: just access to the
> > expression arguments and literals would be pretty powerful.
>
> Parsing a string to an expression is what every parser does ;). You
> already have that built into the C# compiler (v3.5), so why re-do that?

Because you might want to do it at execution time. Suppose the queries
are stored in a configuration file - you don't really want to have to
build valid C# to give it to the C# compiler, if there's a parser which
deals specifically with LINQ queries.


May 27 '07 #7

Marc Gravell

Maybe I'm not explaining myself well...
My point being; the creation of expression trees is not solely the
domain of the compiler (you can create them at runtime through code)
[equally, neither is their execution, hence D-LINQ, object-LINQ, etc].
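
Creating a tree "at runtime through code" can be sketched with the Expression factory methods. This is a minimal illustration, not anything from the Linq samples; Item and ManualTreeSketch are hypothetical names. It builds the equivalent of item => item.Value1 % 3 == 0 by hand.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data type used only for this illustration.
class Item
{
    public int Value1 { get; set; }
}

static class ManualTreeSketch
{
    public static Expression<Func<Item, bool>> Build()
    {
        // item
        ParameterExpression item = Expression.Parameter(typeof(Item), "item");
        // item.Value1
        Expression value = Expression.Property(item, "Value1");
        // item.Value1 % 3
        Expression remainder = Expression.Modulo(value, Expression.Constant(3));
        // item.Value1 % 3 == 0
        Expression test = Expression.Equal(remainder, Expression.Constant(0));
        // item => item.Value1 % 3 == 0
        return Expression.Lambda<Func<Item, bool>>(test, item);
    }
}
```

The verbosity of this for a one-line filter is the motivation for wanting a parser that produces the same tree from text.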

Occasionally in a system it is necessary to determine some factors at
runtime, for instance through configuration. Rather than write yet
another parser, it would seem highly advantageous if LINQ allowed us a
common supported expression parser. I can think of many uses of such.

It doesn't sound as though anybody is aware of anything... maybe I'll
see what I can cobble together with expression trees. Pity.

Marc

May 27 '07 #8

Frans Bouma [C# MVP]

Jon Skeet [C# MVP] wrote:

> Frans Bouma [C# MVP] <pe******************@xs4all.nl> wrote:
>
> <snip>
>
> > > The ability to parse a string to an expression would be very
> > > powerful... although there would obviously be some limitations in
> > > terms of resolving external entities... but heck: just access to
> > > the expression arguments and literals would be pretty powerful.
> >
> > Parsing a string to an expression is what every parser does ;). You
> > already have that built into the C# compiler (v3.5), so why re-do
> > that?
>
> Because you might want to do it at execution time. Suppose the
> queries are stored in a configuration file - you don't really want to
> have to build valid C# to give it to the C# compiler, if there's a
> parser which deals specifically with LINQ queries.

If the queries are stored in a config file, they're not Linq queries,
IMHO. So if the proposal is: <text in some format> -> Expression trees,
sure, that can be helpful, but only if the expression tree is then
usefully interpreted by the interpreter you want to use.

The expression tree interpreter and the actual class/object which
executes the interpretation result are very tightly connected. This
means that you can't have an expression tree parser without a class
which does something with the interpretation of the tree. The tree
isn't the goal, it's an intermediate format between Linq query in VB/C#
and a series of commands to execute.

IMHO, it's only useful to store expression trees as text somewhere if
you can't have the C# code somewhere AND the series of commands you
want to execute is produced by a class which only eats expression
trees. Any other situation doesn't justify the route of expression
trees IMHO.

FB


May 28 '07 #9

Frans Bouma [C# MVP]

Marc Gravell wrote:

> Maybe I'm not explaining myself well...
> My point being; the creation of expression trees is not solely the
> domain of the compiler (you can create them at runtime through code)
> [equally, neither is their execution, hence D-LINQ, object-LINQ, etc].
>
> Occasionally in a system it is necessary to determine some factors at
> runtime, for instance through configuration. Rather than write yet
> another parser, it would seem highly advantageous if LINQ allowed us a
> common supported expression parser. I can think of many uses of such.

What's needed is solid documentation of what an expression tree looks
like when given Linq elements are expressed in C#/VB. A generic
expression parser is actually not that useful. All it can do is tell
you what kind of node is found at a given position: it doesn't do
anything for you. The tough element is to DO something useful AFTER a
given node or tree branch has been examined and understood.

Interpreting a string similar to interpreting an expression tree is
equal to interpreting the expression tree. At least if you're using an
LL(n) parser. An LR(n) parser shifts/reduces the parse-tree while
evaluating it AND building it so you can best describe the expression
tree as a parse result of an LL(n) parse operation of input text. The
tree is then handed over to the engine which consumes the tree for
doing something with it, be it emitting code, constructing commands etc.

Though, you only need to go that route if you work with the end part
of the sequence just described, i.e. the engine which consumes an
expression tree. Then you need to feed it an expression tree to produce
something.

If you then need to build queries at runtime, you indeed are out of
luck UNLESS you build the expression tree manually.

That's also why I don't understand why you want to use strings which
are output from the expression tree in the first place: you already
have the tree! Not only that, going from strings to expression tree
with a parser effectively re-implements what's already in the C#
compiler. I really fail to see why the effort is needed or even
'powerful' or useful. Please give me a use-case scenario, as 'things
from configuration' is too vague: what are you going to do with the data
from the config file? Why is strings -> expression tree the only
real solution to your situation?

Besides that, generating a piece of C# with the linq queries, compiling
it and running it at runtime takes 100 lines of code max.

> It doesn't sound as though anybody is aware of anything... maybe I'll
> see what I can cobble together with expression trees. Pity.

There are some blank spots here and there, but the overall system
isn't that obscure.

What bugs me is that I still have no clue what you're trying to
achieve.

FB


May 28 '07 #10

Jon Skeet [C# MVP]

Frans Bouma [C# MVP] <pe******************@xs4all.nl> wrote:

<snip>

> That's also why I don't understand why you want to use strings which
> are outputted from the expression tree in the first place: you already
> have the tree! Not only that, going from strings to expression tree
> with a parser effectively re-implements what's already in the C#
> compiler. I really fail to see why the effort is needed or even
> 'powerful' or useful. Please give me a use-case scenario, as things
> from configuration is too vague: what are you going to do with the data
> from the config file? Why is the strings -> expression tree the only
> real solution to your situation?

It doesn't need to be the *only* real solution to make it useful - it
just needs to be the *best* one in some way.

Here's an example from a similar system. I was on a project which could
generate reports, and to add a new report you simply provided some data
which included a Hibernate query in HQL form. A similar system based on
LINQ could use a LINQ query, even if it's tied to LINQ to SQL.

> Besides that, generating a piece of C# with the linq queries, compile
> it and run it at runtime takes 100 lines of code max.

That's a really ugly solution though, IMO.

<snip>


May 28 '07 #11

Barry Kelly

Frans Bouma [C# MVP] wrote:

> Marc Gravell wrote:
>
> > Maybe I'm not explaining myself well...
> > My point being; the creation of expression trees is not solely the
> > domain of the compiler (you can create them at runtime through code)
> > [equally, neither is their execution, hence D-LINQ, object-LINQ, etc].
> >
> > Occasionally in a system it is necessary to determine some factors at
> > runtime, for instance through configuration. Rather than write yet
> > another parser, it would seem highly advantageous if LINQ allowed us a
> > common supported expression parser. I can think of many uses of such.
>
> What's needed is solid documentation how an expression tree looks
> like when given Linq elements are expressed in C#/VB.

This is basically describing the semantics for a given parse. It's not
too difficult, and for a library parser, if there was one, it could
follow a pre-existing syntax (such as C#). Of course, there would be
issues - it would need to do the same type inference etc. that C# does.

> A generic
> expression parser is actually not that useful. All it can do is tell
> you what kind of node is found at a given position: it doesn't do
> anything for you.

Uh, that's exactly what an expression parser *is* useful for: turning
text into a tree. And if the tree matches what the C# Linq expression
parsing does, then the language is the same as the one defined by C#.

> The tough element is to DO something useful AFTER a
> given node or treebranch has been examined and understood.

Uh, no. The whole point of the discussion is the turning of text into an
Expression tree. Once you've done that, you're in the same position as
you would have been with writing a C# method taking an
Expression<Func<>>. You can call Compile() on it if you like, or
otherwise hand it off to your database engine - and do this at run time,
not compile time, as is the constraint with text in C# (and see later
for a counterpoint to your "100 lines" dismissal).

The interesting bit (for the poster, as far as I can see) is the bit
*before* this, while you seem fixated on the bit *after* this. Both are
interesting; but you seem to be dismissing one out of hand.

> Interpreting a string similar to interpreting an expression tree is
> equal to interpreting the expression tree. At least if you're using an
> LL(n) parser. An LR(n) parser shifts/reduces the parse-tree while
> evaluating it AND building it so you can best describe the expression
> tree as a parse result of an LL(n) parse operation of input text. The
> tree is then handed over to the engine which consumes the tree for
> doing something with it, be it emitting code, constructing commands etc.
>
> Though, you only need to go that route if you work with the end part
> of the sequence just described, i.e. the engine which consumes an
> expressiontree. Then you need to feed it an expression tree to produce
> something.

Yes, yes, writing a recursive descent LL(1) expression parser limited to
a subset of C# expressions is fairly trivial. I'm not sure what point
you're trying to make here.
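
To give a feel for the scale involved, here is a toy recursive-descent parser that turns text straight into an expression tree. It is a sketch under heavy assumptions, not the parser discussed in this thread: TinyParser is a made-up name, and the grammar is limited to int literals, one int parameter, + - * / %, the comparisons == < >, and parentheses, with no type inference or member access.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical toy grammar:
//   comparison := sum (("==" | "<" | ">") sum)?
//   sum        := term (("+" | "-") term)*
//   term       := factor (("*" | "/" | "%") factor)*
//   factor     := integer | parameter-name | "(" comparison ")"
sealed class TinyParser
{
    readonly string text;
    readonly ParameterExpression param;
    int pos;

    TinyParser(string text, ParameterExpression param)
    {
        this.text = text;
        this.param = param;
    }

    public static Expression<Func<int, bool>> ParsePredicate(string text, string paramName)
    {
        ParameterExpression p = Expression.Parameter(typeof(int), paramName);
        TinyParser parser = new TinyParser(text, p);
        Expression body = parser.Comparison();
        parser.SkipSpaces();
        if (parser.pos != text.Length)
            throw new FormatException("Unexpected input at " + parser.pos);
        if (body.Type != typeof(bool))
            throw new FormatException("Expression is not a comparison");
        return Expression.Lambda<Func<int, bool>>(body, p);
    }

    void SkipSpaces()
    {
        while (pos < text.Length && char.IsWhiteSpace(text[pos])) pos++;
    }

    // Consumes the token if it is next in the input.
    bool Accept(string token)
    {
        SkipSpaces();
        if (string.CompareOrdinal(text, pos, token, 0, token.Length) != 0) return false;
        pos += token.Length;
        return true;
    }

    Expression Comparison()
    {
        Expression left = Sum();
        if (Accept("==")) return Expression.Equal(left, Sum());
        if (Accept("<")) return Expression.LessThan(left, Sum());
        if (Accept(">")) return Expression.GreaterThan(left, Sum());
        return left;
    }

    Expression Sum()
    {
        Expression left = Term();
        while (true)
        {
            if (Accept("+")) left = Expression.Add(left, Term());
            else if (Accept("-")) left = Expression.Subtract(left, Term());
            else return left;
        }
    }

    Expression Term()
    {
        Expression left = Factor();
        while (true)
        {
            if (Accept("*")) left = Expression.Multiply(left, Factor());
            else if (Accept("/")) left = Expression.Divide(left, Factor());
            else if (Accept("%")) left = Expression.Modulo(left, Factor());
            else return left;
        }
    }

    Expression Factor()
    {
        if (Accept("("))
        {
            Expression inner = Comparison();
            if (!Accept(")")) throw new FormatException("Expected ')' at " + pos);
            return inner;
        }
        SkipSpaces();
        int start = pos;
        while (pos < text.Length && char.IsLetterOrDigit(text[pos])) pos++;
        string word = text.Substring(start, pos - start);
        if (word.Length == 0) throw new FormatException("Expected a value at " + start);
        if (char.IsDigit(word[0])) return Expression.Constant(int.Parse(word));
        if (word == param.Name) return param;
        throw new FormatException("Unknown identifier '" + word + "'");
    }
}
```

TinyParser.ParsePredicate("x % 3 == 0", "x").Compile() gives a Func<int, bool> with no assembly generated on disk. The parts this sketch leaves out (name resolution against real types, overloads, type inference) are where most of the real work would be.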

> If you then need to build queries at runtime, you indeed are out of
> luck UNLESS you build the expression tree manually.

I don't know how you reach this conclusion. A parser builds the
expression tree automatically - that's the whole point, so that you
don't need to put together your expression tree manually to build a
query at runtime.

> That's also why I don't understand why you want to use strings which
> are outputted from the expression tree in the first place: you already
> have the tree!

I think you've got this backward; someone simply pointed out that
ToString() on an expression tree printed out text in what appeared to be
a canonical language for .NET expression trees. The natural question
then is: is there a parser for this language in the .NET 3.5 framework?
(There doesn't appear to be.)

It isn't that someone is going to try and round-trip their tree through
ToString() and back again with a parser. That would be completely
useless, and attacking that is attacking a complete straw man. The
reference to ToString() is simply pointing out that there *is* a
language implicitly defined by the result of this method, so the natural
question is - where is the parser for this language?

> Not only that, going from strings to expression tree
> with a parser effectively re-implements what's already in the C#
> compiler.

Yes, but the C# compiler is rather heavyweight and clumsy, lacking
context (definitions that exist at runtime and not necessarily in any
static assembly) and nice error reporting, and resulting in compiled
assemblies on disk - not nearly as light-weight as things like
Expression trees and DynamicMethods. Calling out to C# will end up in
producing an assembly, which you're either going to have to load (and
never release) in the current AppDomain, or awkwardly and indirectly
load at arms-length in another AppDomain, all to get a simple Expression
tree that should be available with a simple 'Expression Parse(string
text);' method.

> I really fail to see why the effort is needed or even
> 'powerful' or useful.
>
> Please give me a use-case scenario, as things
> from configuration is too vague: what are you going to do with the data
> from the config file? Why is the strings -> expression tree the only
> real solution to your situation?

I once worked on a system that used a lightweight databinding language
for evaluating GUI properties, business properties and data-level
constraints (among other things), where the source texts of this
language were stored in a metadata system. Each individual piece of
source code averaged no more than 30 characters long or so.
Collectively, there were more than 10,000 snippets of this source code
in a medium-size financial services (insurance) data entry application.
The system compiled these snippets at runtime, and could load new
metadata as the system was running, and transfer over to it, without
rebooting. Metadata could churn indefinitely.

Using the C# compiler for this would have been deeply painful and slow.

> Besides that, generating a piece of C# with the linq queries, compile
> it and run it at runtime takes 100 lines of code max.

10000 snippets of code = 10000 assemblies that need loading. Sure, there
are workarounds via AppDomains and bunching texts etc., but then you've
moved into far more complex territory than what ought to be a simple
one-liner - 'Expression Parse(string)'.

> > It doesn't sound as though anybody is aware of anything... maybe I'll
> > see what I can cobble together with expression trees. Pity.
>
> There are some blank spots here and there, but the overall system
> isn't that obscure.
>
> What bugs me is that I still have no clue what you're trying to
> achieve.

Hopefully I've helped - the need is very, very clear to me. But then I
write compilers for a living.

-- Barry

--
http://barrkel.blogspot.com/

May 28 '07 #12

Jon Skeet [C# MVP]

Barry Kelly <ba***********@gmail.com> wrote:

> <big snip>
>
> > What bugs me is that I still have no clue what you're trying to
> > achieve.
>
> Hopefully I've helped - the need is very, very clear to me. But then I
> write compilers for a living.

I haven't yet looked at what ToString() produces. Given your compiler
experience and the enthusiasm of a few of us from the group, any reason we
shouldn't produce our own decent open source parser?


May 28 '07 #13

Barry Kelly

Jon Skeet wrote:

> Barry Kelly <ba***********@gmail.com> wrote:
>
> > <big snip>
> >
> > > What bugs me is that I still have no clue what you're trying to
> > > achieve.
> >
> > Hopefully I've helped - the need is very, very clear to me. But then I
> > write compilers for a living.
>
> I haven't yet looked at what ToString() produces. Given your compiler
> experience and the enthusiasm of a few of us from the group, any reason we
> shouldn't produce our own decent open source parser?

It's possible. I've only had a couple of hours to poke around in the
Orcas Beta 1 VM, and what I've seen of ToString()'s results is that it
includes invalid identifiers, presumably created by csc.exe to avoid
possibility of clashes.

C# expression trees don't include the full language. A statement
lambda such as 'x => { return x; }' can't be converted to an expression
tree; only expression lambdas can. That makes life a lot easier for the
parser writer.
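
A short illustration of that restriction, with hypothetical names (LambdaKindsSketch and its fields): the same logic is accepted as an expression tree only in expression-lambda form.

```csharp
using System;
using System.Linq.Expressions;

static class LambdaKindsSketch
{
    // Expression lambda: convertible to an expression tree.
    public static readonly Expression<Func<int, int>> AsTree = x => x + 1;

    // Statement lambda: fine as a delegate...
    public static readonly Func<int, int> AsDelegate = x => { return x + 1; };

    // ...but not as a tree. Uncommenting the next line should be rejected
    // by the compiler (error CS0834: a lambda expression with a statement
    // body cannot be converted to an expression tree).
    // public static readonly Expression<Func<int, int>> Bad = x => { return x + 1; };
}
```

For a parser author this means the input language only needs expressions, with no statements, blocks, or local declarations to handle.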

There are questions about how extensible / dynamic the symbol table
stuff needs to be (scoping, name resolution), though one could just
cheat and have a callback for that, defaulting to something sensible.
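
One way the "callback with a sensible default" idea might look, purely as an assumption about the design: the parser asks a delegate to turn each identifier into an Expression, and the default treats the identifier as a property or field of the lambda parameter. Order, NameResolutionSketch, and GreaterThan are hypothetical names, and the method stands in for the point in a parser where an identifier has just been read.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data type used only for this illustration.
class Order
{
    public int Quantity { get; set; }
}

static class NameResolutionSketch
{
    // Builds item => <identifier> > limit, resolving the identifier
    // through the supplied callback.
    public static Expression<Func<T, bool>> GreaterThan<T>(
        string identifier,
        int limit,
        Func<string, ParameterExpression, Expression> resolve = null)
    {
        if (resolve == null)
        {
            // Default: look the name up as a property or field on the parameter.
            resolve = (name, target) => Expression.PropertyOrField(target, name);
        }

        ParameterExpression item = Expression.Parameter(typeof(T), "item");
        Expression left = resolve(identifier, item);
        Expression body = Expression.GreaterThan(left, Expression.Constant(limit));
        return Expression.Lambda<Func<T, bool>>(body, item);
    }
}
```

A caller with its own symbol table (configuration values, variables defined at runtime) would pass a different callback, for example one that returns Expression.Constant for known names.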

I also haven't looked deeply into the type inferencing that would be
required. C#'s model of type inferencing is peculiar: lambdas are
untyped until they are assigned, and at that point type information
binding sort of flows down the tree from the root, rather than up the
tree from the leaves, as it more usually does. Eric Lippert's blog
indicates that the general algorithm is exponential in worst cases,
IIRC.

-- Barry


May 28 '07 #14

Jon Skeet [C# MVP]

Barry Kelly <ba***********@gmail.com> wrote:

> > I haven't yet looked at what ToString() produces. Given your compiler
> > experience and the enthusiasm of a few of us from the group, any reason we
> > shouldn't produce our own decent open source parser?
>
> It's possible. I've only had a couple of hours to poke around in the
> Orcas Beta 1 VM, and what I've seen of ToString()'s results is that it
> includes invalid identifiers, presumably created by csc.exe to avoid
> possibility of clashes.
>
> C# expression trees don't include the full language. Using a statement
> lambda such as 'x => { return x; }' can't be used as an expression
> lambda. That makes life a lot easier for the parser writer.

Indeed - for the C# parser, at least. Goodness knows what's feasible as
an expression tree itself though.

> There are questions about how extensible / dynamic the symbol table
> stuff needs to be (scoping, name resolution), though one could just
> cheat and have a callback for that, defaulting to something sensible.
>
> I also haven't looked deeply into the type inferencing that would be
> required. C#'s model of type inferencing is peculiar: lambdas are
> untyped until they are assigned, and at that point type information
> binding sort of flows down the tree from the root, rather than up the
> tree from the leaves, as it more usually does. Eric Lippert's blog
> indicates that the general algorithm is exponential in worst cases,
> IIRC.

Yes, I seem to remember that could get pretty horrendous. For some
reason I was hoping that it would be simpler for an expression tree - I
need to start looking at the whole area in a lot more detail soon.


May 28 '07 #15

Marc Gravell

** STOP PRESS **

Haven't finished looking, but after a bit of a hunt, the answer may be
already there in the "Dynamic Expression API"; in particular,
System.Linq.Dynamic.DynamicExpression has Parse() and ParseLambda()

These are described in the Linq source (LinqSamples-2007_4.18), from
here: http://msdn2.microsoft.com/en-us/bb330936.aspx, looking in the
"DynamicQuery" folder.

I don't have my Beta-1 machine to-hand, so I will have to look at this
more tomorrow, but it looks very interesting!

Marc

May 28 '07 #16

Marc Gravell

Sweet ;-p Curious that it isn't part of the core framework, but rather
is a code-sample!

Marc

using System;
using System.Linq;
using System.Linq.Dynamic; // from the DynamicQuery sample, not the core framework
using System.Linq.Expressions;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        string filterFromDatabase = @"Value1 % 3 == 0";

        IQueryable<DataItem> qdi = null;
        if (qdi != null)
        { // just to illustrate syntax
            qdi.Where(filterFromDatabase);
        }

        // parse a string to an expression
        Expression<Func<DataItem, bool>> e =
            DynamicExpression.ParseLambda<DataItem, bool>(
                filterFromDatabase);

        // compile to a Func
        Func<DataItem, bool> f = e.Compile();

        // test some data
        List<DataItem> testData = new List<DataItem>();
        for (int i = 0; i < 25; i++)
        {
            testData.Add(new DataItem(i));
        }
        foreach (DataItem di in testData.Where(f))
        {
            Console.WriteLine(di.Value1);
        }
    }
}

class DataItem
{
    private int value1;
    public int Value1 { get { return value1; } set { value1 = value; } }

    public DataItem() { }
    public DataItem(int value1)
    {
        Value1 = value1;
    }
}

May 28 '07 #17

Frans Bouma [C# MVP]

Barry Kelly wrote:

> Frans Bouma [C# MVP] wrote:
>
> > A generic
> > expression parser is actually not that useful. All it can do is tell
> > you what kind of node is found at a given position: it doesn't do
> > anything for you.
>
> Uh, that's exactly what an expression parser *is* useful for: turning
> text into a tree. And if the tree matches what the C# Linq expression
> parsing does, then the language is the same as the one defined by C#.

Though, then you're re-doing what's already in C#. The thing is,
creating the tree isn't trivial either. It's not a simple parser which
builds a tree, as the query very likely refers to elements used IN the
code.

If this quest for a parser is to get dynamic queries working (queries
built up at runtime, in a search form for example, where various query
fragments are added based on the selections of the user), there are
workarounds for this using C# without the necessity of looking at the
expression tree.

> > The tough element is to DO something useful AFTER a
> > given node or treebranch has been examined and understood.
>
> Uh, no. The whole point of the discussion is the turning of text into
> an Expression tree. Once you've done that, you're in the same
> position as you would have been with writing a C# method taking an
> Expression<Func<>>. You can call Compile() on it if you like, or
> otherwise hand it off to your database engine - and do this at run
> time, not compile time, as is the constraint with text in C# (and see
> later for a counterpoint to your "100 lines" dismissal).

You can generate C# code using a template, compile it and run it in
less than 100 lines. This means that if you formulate your queries as C#
code and store them as text in a file, you can load them one by one and
generate C# code at runtime at startup which wraps these queries in
a method, so you get back an IQueryable and can use them at runtime.
No need for a parser and this isn't slow either.

> The interesting bit (for the poster, as far as I can see) is the bit
> *before* this, while you seem fixated on the bit *after* this. Both are
> interesting; but you seem to be dismissing one out of hand.

Well, I've written a couple of parsers in the last few years (one
lr(n) parser generator and a couple of ll(n) ones) and although it's
interesting to write a parser from a geek standpoint, I also learned
it's also a waste of time if you can avoid it and leverage code which
is already there. Furthermore, the reason why expression trees aren't
serializable is IMHO also the reason why it's very hard to build a
USABLE expression tree from an external source which has to be working
with elements in the code you're executing.

IMHO the thing is: Linq queries aren't a separate entity in the code
which embeds them: they are connected with the code which embeds them.
This means that although you can separate them out in theory, in
practice this won't lead to anything, simply because the interesting
thing isn't the expression nor the tree, it's the consumer of the
expression tree. That consumer can't work with a tree which is created
in a void, it has to work with a tree which connects to the code
executing or at least be able to (filter on the value of a variable, for
example).

The question whether there is a parser for a linq expression in string
format is a nice one, but what I wonder is: what's the bottom-line
reason why this parser is absolutely needed? These strings don't fall
out of the sky, and with a tree alone you don't get very far, so there's
apparently an expression tree consuming object which has to be fed with
queries which can't be created in code... I'd like to know what that
scenario is. I saw people mentioning 'configuration', but I don't
understand what that has to do with it. Unless people want to write out
their linq queries in string form in a file (then you really must hate
your life), I fail to see why it can be of any value (and even then).

If you then need to build queries at runtime, you indeed are out of
luck UNLESS you build the expression tree manually.

I don't know how you reach this conclusion. A parser builds the
expression tree automatically - that's the whole point, so that you
don't need to put together your expression tree manually to build a
query at runtime.

Why would you want to build a query at runtime which can't be created
through code? It's not the parser -> parse tree conversion, it's the
parser -> parse tree -> Linq expression tree conversion that's the
trouble: the linq expression tree has to refer to elements in the code,
or at least has to be able to. They can't serialize it to disk in any
format because of this, so re-building it at runtime would be a problem
as well.

My point is twofold:
- just because something is cool doesn't make it worth looking into.
This is a general thing, but it blurs what's really important.
- the long road from expression tree to string to expression tree is
IMHO odd and not doable, as it IS serialization/deserialization of the
tree, which isn't possible due to the nature of the tree. It's also IMHO
silly, as you should simply write general code which builds the
expression tree with Linq based on the input. That would save you from
having to write a parser and would also spare you the misery of tying
the expression tree to the embedding code.
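
A minimal sketch of what "general code which builds the expression tree"
could look like, assuming a stand-in DataItem class with an int Value1
property. ManualFilter and DivisibleBy are names invented for this
illustration; only the System.Linq.Expressions factory methods are real
framework API.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data class, assumed for this sketch only.
public class DataItem
{
    public int Value1 { get; set; }
    public DataItem(int value1) { Value1 = value1; }
}

public static class ManualFilter
{
    // Builds the tree for: item => item.Value1 % divisor == 0
    // node by node, without parsing any string.
    public static Expression<Func<DataItem, bool>> DivisibleBy(int divisor)
    {
        ParameterExpression item = Expression.Parameter(typeof(DataItem), "item");
        MemberExpression value = Expression.Property(item, "Value1");
        BinaryExpression remainder =
            Expression.Modulo(value, Expression.Constant(divisor));
        BinaryExpression test =
            Expression.Equal(remainder, Expression.Constant(0));
        return Expression.Lambda<Func<DataItem, bool>>(test, item);
    }
}
```

Calling ManualFilter.DivisibleBy(3).Compile() yields a Func<DataItem, bool>
which can be handed to Where() on an in-memory list, while the uncompiled
tree can be handed to an IQueryable provider.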

But maybe I miss a very obvious use-case of string-based queries
here... String based queries in general suck bigtime (checking of
validity, maintainability), unless they're checked at compile time, and
any way to avoid them is preferred. The only way it works is when the
queries are checked at compile time so you can maintain them properly
and won't run into surprises at runtime.

That's also why I don't understand why you want to use strings
which are output by the expression tree in the first place:
you already have the tree!

I think you've got this backward; someone simply pointed out that
ToString() on an expression tree printed out text in what appeared to
be a canonical language for .NET expression trees. The natural
question then is: is there a parser for this language in the .NET 3.5
framework? (There doesn't appear to be.)

It isn't that someone is going to try and round-trip their tree
through ToString() and back again with a parser. That would be
completely useless, and attacking that is attacking a complete straw
man. The reference to ToString() is simply pointing out that there is
a language implicitly defined by the result of this method, so the
natural question is - where is the parser for this language?

err, why would someone need a parser for that if it ISN'T about
roundtripping? My whole understanding about needing the parser is for
roundtripping of the tostring output back to an expression tree, which
as you say, is useless.

If it's about 'I need to write my queries as strings', it's also
useless: queries which aren't checked at compile time (or at least
partly checked) are time consuming and hard to maintain.

Sure, if it was possible to have a DSL in some string format which is
embeddable at runtime into the running IL and could work together with
that IL, why not. Unfortunately, that's not the case, there's no
context switcher available for you, so both sides (C# and DSL) have to
know of one another and how to interact with the other. I don't think
that's doable with a simple parser which builds an expression tree from
a query string, because you don't have a symbol table at hand of the C#
code creating the tree.

Not only that, going from strings to expression tree
with a parser effectively re-implements what's already in the C#
compiler.

Yes, but the C# compiler is rather heavyweight and clumsy, lacking
context (definitions that exist at runtime and not necessarily in any
static assembly) and nice error reporting, and resulting in compiled
assemblies on disk - not nearly as light-weight as things like
Expression trees and DynamicMethods. Calling out to C# will end up in
producing an assembly, which you're either going to have to load (and
never release) in the current AppDomain, or awkwardly and indirectly
load at arms-length in another AppDomain, all to get a simple
Expression tree that should be available with a simple 'Expression
Parse(string text);' method.

For a single string I wouldn't go for this but then I also wouldn't
write a complete parser for this. For a lot of strings (think iBatis
with linq queries) you can do this rather easily with an in-memory
assembly compiled at runtime. Sure that's part of your appdomain but
the queries are needed at runtime in your app so keeping the assembly
around isn't a big deal.

I really fail to see why the effort is needed or even
'powerful' or useful.

Please give me a use-case scenario, as 'things
from configuration' is too vague: what are you going to do with the
data from the config file? Why is strings -> expression tree
the only real solution to your situation?

I once worked on a system that used a lightweight databinding language
for evaluating GUI properties, business properties and data-level
constraints (among other things), where the source texts of this
language were stored in a metadata system. Each individual piece of
source code averaged no more than 30 characters long or so.
Collectively, there were more than 10,000 snippets of this source code
in a medium-size financial services (insurance) data entry
application. The system compiled these snippets at runtime, and
could load new metadata as the system was running, and transfer over
to it, without rebooting. Metadata could churn indefinitely.

Using the C# compiler for this would have been deeply painful and
slow.

The C# compiler is just used to compile the code. In the end IMHO it
doesn't make a difference, as the text has to be parsed anyway, which
IMHO is done pretty quickly by the C# compiler.

That aside, I think what's really going on here is an attempt to embed
a DSL into C# at runtime, isn't it?

In _THAT_ particular situation, in theory I'm all for it. The sad
thing is though, there's no context switching done for you when you
move from C# space to the DSL space and back (the DSL here being the
expression tree as string).

Your example is also about a DSL which is embedded in C# code, and
likely a separate governing system is used to make the switch between
C# and DSL and back, so you can refer to elements in the C# code in your
DSL and get results back from your DSL in C# code.

So to make this work for expression strings, this governing system has
to be written as well, and therefore your C# code has to be prepared
for it, as it is IMHO a tough struggle to get random expression strings
from disk and make them able to refer to the elements in the embedding
code and vice versa. Because _THAT_'s the problem. Parsing a string
isn't that hard; it's what you want to do with the tree you get after
the parse, as it's IMHO tough to do anything with it if it can't refer
to anything on the outside. Some queries fit that description, but
often they do not.

Besides that, generating a piece of C# with the linq queries,
compiling it and running it at runtime takes 100 lines of code max.

10000 snippets of code = 10000 assemblies that need loading. Sure,
there are workarounds via AppDomains and bunching texts etc., but
then you've moved into far more complex territory than what ought to
be a simple one-liner - 'Expression Parse(string)'.

Well, reading the 10000 snippets is likely to take some time, but you
have to do that anyway. If you want to do this templated, you need a
few steps: first load the template, then compile the template into an
assembly (or do this once and re-use the assembly), run the template,
which consumes the snippets and generates C# code in-memory, then
compile that C# code, and you have an assembly with classes with
methods you can call to get expression trees from. Complex? Not really,
it's very easy to get this running. Most of the code in this kind of
code generator is about formatting output. That's not an issue here, so
the code you have to write is very simple.

It's to illustrate that there is a way to do this already, and you can
use C# or VB.NET for the strings. The fun part of this approach is also
that you can make the template (which is also written in C#) be more
clever and for example emit methods which take parameters so you CAN
link embedding code with the linq queries (filter all orders on the id
of the customer which is selected in the grid).
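
As a rough illustration of the parameterised-method idea (this is not
the generated code itself, just the shape such a method might have): a
method takes a value from the embedding code and returns an expression
tree which refers to it. Order, OrderQueries, ForCustomer and
OrderIdsFor are hypothetical names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity, assumed for this sketch only.
public class Order
{
    public int OrderID { get; set; }
    public string CustomerID { get; set; }
}

public static class OrderQueries
{
    // The method parameter is captured by the lambda, so the resulting
    // tree refers to a value supplied by the calling code.
    public static Expression<Func<Order, bool>> ForCustomer(string customerId)
    {
        return o => o.CustomerID == customerId;
    }

    // Applies the filter to an in-memory sequence through IQueryable.
    public static List<int> OrderIdsFor(IEnumerable<Order> orders, string customerId)
    {
        return orders.AsQueryable()
                     .Where(ForCustomer(customerId))
                     .Select(o => o.OrderID)
                     .ToList();
    }
}
```

Here the C# compiler does the work of tying the tree to the embedding
code, which is the part a bare string has no way to express on its own.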

A one-liner 'Expression.Parse(string)' looks great, but it's not going
to work. Where do you fix up the tree so it refers to objects,
properties and variables currently in scope?

It doesn't sound as though anybody is aware of anything... maybe
I'll see what I can cobble together with expression trees. Pity.
There are some blank spots here and there, but the overall system
isn't that obscure.

What bugs me is that I still have no clue what you're trying to
achieve.

Hopefully I've helped - the need is very, very clear to me. But then I
write compilers for a living.

As a person who writes query consuming code for a living, I do
understand the necessity for dynamic queries built at runtime, but
using strings for that isn't the answer, as it's unmaintainable and
error prone.

FB

May 29 '07 #18

Frans Bouma [C# MVP]

Marc Gravell wrote:

Sweet ;-p Curious that it isn't part of the core framework, but rather
is a code-sample!

Marc

using System;
using System.Linq;
using System.Linq.Dynamic; // from the code sample, not the core framework
using System.Linq.Expressions;
using System.Collections.Generic;

// DataItem was not part of the quoted sample; a minimal version is
// assumed here so the code compiles.
class DataItem
{
    public int Value1 { get; private set; }
    public DataItem(int value1) { Value1 = value1; }
}

class Program
{
    static void Main(string[] args)
    {
        string filterFromDatabase = @"Value1 % 3 == 0";

        IQueryable<DataItem> qdi = null;
        if (qdi != null)
        {   // just to illustrate syntax
            qdi.Where(filterFromDatabase);
        }

        // parse a string to an expression
        Expression<Func<DataItem, bool>> e =
            DynamicExpression.ParseLambda<DataItem, bool>(
                filterFromDatabase);

        // compile to a Func
        Func<DataItem, bool> f = e.Compile();

        // test some data
        List<DataItem> testData = new List<DataItem>();
        for (int i = 0; i < 25; i++)
        {
            testData.Add(new DataItem(i));
        }
        foreach (DataItem di in testData.Where(f))
        {
            Console.WriteLine(di.Value1);
        }
    }
}

The only way this can work IMHO is that they create PropertyDescriptor
instances upfront in the tree which refer to Value1, correct? (I don't
have B1 available to me at this point so can't check in-memory with the
debugger).

FB

May 29 '07 #19

Frans Bouma [C# MVP]

Marc Gravell wrote:

Sweet ;-p Curious that it isn't part of the core framework, but rather
is a code-sample!

Marc

using System;
using System.Linq;
using System.Linq.Dynamic;
using System.Linq.Expressions;
using System.Collections.Generic;

class Program
{
static void Main(string[] args)
{
string filterFromDatabase = @"Value1 % 3 == 0";

btw, you mentioned expression trees. I was under the assumption you
referred to full queries, so with from, source, joins, filters etc. etc.,
not just single predicates.

That's also why I replied the way I did. Single predicates can indeed be
parsed at runtime. The scope is limited though IMHO. Full queries in
text form are a totally different ballgame.

FB

May 29 '07 #20

Marc Gravell

The only way this can work IMHO is that they create propertydescriptor
instances upfront in the tree which refer to Value1, correct?

Pretty much... actually, it uses a PropertyInfo; I don't know enough
about LINQ (yet) to know whether this is the only way... there is also
an overload of Expression.Property() that accepts the name, but
presumably this would then be less bound. I haven't tried reading the
IL to see what is going on in each overload.
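
A small sketch comparing the two overloads mentioned above, assuming a
stand-in DataItem class (PropertyOverloads and its methods are
hypothetical names). Going by the documented behaviour, the name-based
overload looks the property up by reflection when the node is created,
and throws an ArgumentException if it cannot find it, so both calls
should produce a MemberExpression holding a PropertyInfo.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical data class, assumed for this sketch only.
public class DataItem
{
    public int Value1 { get; set; }
    public DataItem(int value1) { Value1 = value1; }
}

public static class PropertyOverloads
{
    // Overload taking an explicit PropertyInfo.
    public static MemberExpression ByPropertyInfo(ParameterExpression item)
    {
        PropertyInfo info = typeof(DataItem).GetProperty("Value1");
        return Expression.Property(item, info);
    }

    // Overload taking only the property name.
    public static MemberExpression ByName(ParameterExpression item)
    {
        return Expression.Property(item, "Value1");
    }

    // True when both nodes refer to a property with the same name,
    // declared on the same type.
    public static bool SameMember()
    {
        ParameterExpression item = Expression.Parameter(typeof(DataItem), "item");
        MemberInfo a = ByPropertyInfo(item).Member;
        MemberInfo b = ByName(item).Member;
        return a.Name == b.Name && a.DeclaringType == b.DeclaringType;
    }
}
```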

Marc

May 29 '07 #21

Frans Bouma [C# MVP]

Marc Gravell wrote:

The only way this can work IMHO is that they create
propertydescriptor instances upfront in the tree which refer to
Value1, correct?

Pretty much... actually, it uses a PropertyInfo; I don't know enough
about LINQ (yet) to know whether this is the only way... there is
also an overload of Expression.Property() that accepts the name, but
presumably this would then be less bound. I haven't tried reading the
IL to see what is going on in each overload.

Ok, so this is then indeed usable in Linq to Objects scenarios where
property descriptors can be used to filter objects. Using a tree with
property info/descriptor objects to generate SQL is a different thing
IMHO, as the descriptor of the field doesn't tell you which element it
belongs to. So "CustomerID == 'CHOPS'" is a valid filter for Order but
also for Customer. Having both in a query gives a bit of a problem as
to which element you refer to: customer or order ;)

I haven't checked out the string output in detail for an expression
tree which represents a full 'from ... where select' query. Are you
looking for a parser for this kind of query or just for single
predicate -> tree fragment?

FB

May 29 '07 #22

Marc Gravell

The code in the sample doc also includes Select etc support, but my
main question really related to filters (partly due to trying to solve
a specific problem). Since this was (broadly) comparable to some of
the LINQ features, I thought I'd investigate what (if any) overlap
existed. In short, I was after a parser for a string filter, that
could spit out a delegate. I took a stab in the dark that LINQ might
offer this as a by-product. I wasn't far off...
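
To make "a parser for a string filter that spits out a delegate"
concrete, here is a deliberately tiny sketch. It is not the sample
library's parser: the grammar (property name, comparison operator,
integer literal, separated by spaces) and the names TinyFilterParser and
DataItem are assumptions made up for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data class, assumed for this sketch only.
public class DataItem
{
    public int Value1 { get; set; }
    public DataItem(int value1) { Value1 = value1; }
}

public static class TinyFilterParser
{
    // Parses filters of the form "<property> <operator> <integer>",
    // e.g. "Value1 >= 10", and compiles them to a delegate.
    // The named property must be an int property of T.
    public static Func<T, bool> Parse<T>(string filter)
    {
        string[] parts = filter.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 3)
            throw new FormatException("Expected: <property> <operator> <integer>");

        ParameterExpression item = Expression.Parameter(typeof(T), "item");
        // Throws ArgumentException if T has no such property.
        Expression left = Expression.Property(item, parts[0]);
        Expression right = Expression.Constant(int.Parse(parts[2]));

        Expression body;
        switch (parts[1])
        {
            case "==": body = Expression.Equal(left, right); break;
            case "!=": body = Expression.NotEqual(left, right); break;
            case "<":  body = Expression.LessThan(left, right); break;
            case "<=": body = Expression.LessThanOrEqual(left, right); break;
            case ">":  body = Expression.GreaterThan(left, right); break;
            case ">=": body = Expression.GreaterThanOrEqual(left, right); break;
            default: throw new FormatException("Unknown operator: " + parts[1]);
        }

        return Expression.Lambda<Func<T, bool>>(body, item).Compile();
    }
}
```

The resulting delegate can be passed straight to Where() on a list, for
example TinyFilterParser.Parse<DataItem>("Value1 >= 10"). Anything
beyond single comparisons (arithmetic, and/or, other types) would need a
real tokenizer and grammar.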

But the sample code may be of use/interest to other devs.

As for your other questions (on the "why")... In this instance, yes: I
did have a specific intent in mind - but I was also motivated by
finding out more about LINQ, what is possible, what isn't, etc. Surely
that understanding is key in knowing when a new technology might be
useful...? And LINQ seems such a change that the more understanding we
have in the community, the better.

May 29 '07 #23

Marc Gravell

See also another reply a few minutes ago.

err, why would someone need a parser for that if it ISN'T about
roundtripping?

How about cross-architecture? As an "off the cuff", how about a
web-service that allowed me to issue a LINQ expression (in a known
scope)? And if the caller isn't .Net, or doesn't have shared assembly
access?

All I'm saying is that:
* LINQ has rich support for qualified queries
* (unrelated) Sometimes, you need to express a query between layers,
systems, or architectures, and a textual form is the lowest common
denominator
* So wouldn't it be nice if LINQ could bridge this gap, using *an*
expression syntax (not necessarily C#), which could be parsed into
proper expression trees

As a person who writes query consuming code for a living, I do
understand the necessity for dynamic queries built at runtime, but
using strings for that isn't the answer

I'll remember that next time I'm writing SQL ;-p

Marc

May 29 '07 #24

Frans Bouma [C# MVP]

Marc Gravell wrote:

err, why would someone need a parser for that if it ISN'T about
roundtripping?

How about cross-architecture? As an "off the cuff", how about a
web-service that allowed me to issue a LINQ expression (in a known
scope)? And if the caller isn't .Net, or doesn't have shared assembly
access?

That's IMHO a bad way of doing SOA. The webservice should be autonomous,
have its own interface (methods) and work on a high level of your
application stack. So you send it messages, not filters.

All I'm saying is that:
* LINQ has rich support for qualified queries
* (unrelated) Sometimes, you need to express a query between layers,
systems, or architectures, and a textual form is the lowest common
denominator
* So wouldn't it be nice if LINQ could bridge this gap, using an
expression syntax (not necessarily C#), which could be parsed into
proper expression trees

That sounds great on paper, but as I said in the quote below: queries
in strings aren't maintainable and often break at runtime. You need a
typed, compile time checked query language which, if an error is
specified, breaks at compile time so you can fix it before you ship it
to the customer ;)

As a person who writes query consuming code for a living, I do
understand the necessity for dynamic queries built at runtime, but
using strings for that isn't the answer

I'll remember that next time I'm writing SQL ;-p

Marc, you're still using EXEC sp_executeSQL "query" ? ;)

FB

May 30 '07 #25

Marc Gravell

That's IMHO a bad way of doing SOA.

Yes - I did say it was "off the cuff"... but I'm still of the opinion
that string-based runtime queries still have a role to play in
some scenarios. It shouldn't be the default, but there seems
to be enough interest to at least not preclude it.

You need a typed, compile time checked query language which,
if an error is specified, breaks at compile time so you can fix
it before you ship it to the customer ;)

In a perfect world, yes. But inevitably some things can't be known
at compile-time. It happens.

Marc, you're still using EXEC sp_executeSQL "query" ? ;)

A bit OT, but if I had the need to write dynamic SQL *inside* the
database (not from C#), then yes: sp_executeSQL on a
parameterised nvarchar query would be the correct solution.

But equally - yes, sometimes it is necessary to build up an SQL
query manually (parameterised of course), beyond what most ORMs
will offer (e.g. a current project involves a fully-temporal database
implementation; I've not had much luck using ORM in such
scenarios - perhaps yours is better than the others, I don't know).
OR-mappers (including D-LINQ) aren't the only code that can
write SQL.

Marc

May 30 '07 #26

Frans Bouma [C# MVP]

Marc Gravell wrote:

That's IMHO a bad way of doing SOA.

Yes - I did say it was "off the cuff"... but I'm still of the opinion
that string-based runtime queries still have a role to play in
some scenarios. It shouldn't be the default, but there seems
to be enough interest to at least not preclude it.

The only reason I've heard in the past years which makes some sense (not
much, but some) is that a string based query language often looks
similar to SQL.

You need a typed, compile time checked query language which,
if an error is specified, breaks at compile time so you can fix
it before you ship it to the customer ;)

In a perfect world, yes. But inevitably some things can't be known
at compile-time. It happens.

I haven't run into situations where you couldn't construct the
object based query at runtime and had to revert to stored strings.

Marc, you're still using EXEC sp_executeSQL "query" ? ;)

A bit OT, but if I had the need to write dynamic SQL inside the
database (not from C#), then yes: sp_executeSQL on a
parameterised nvarchar query would be the correct solution.

But equally - yes, sometimes it is necessary to build up an SQL
query manually (parameterised of course), beyond what most ORMs
will offer (e.g. a current project involves a fully-temporal database
implementation; I've not had much luck using ORM in such
scenarios - perhaps yours is better than the others, I don't know).
OR-mappers (including D-LINQ) aren't the only code that can
write SQL.

Oh, that's true. Reverting to plain SQL is often a maintenance burden,
though; if it's unavoidable, it happens of course. Still, I'd first
look into Views and after that SQL strings created in code which should
use the o/r core. What exactly do you mean with fully temporal database?

FB

May 30 '07 #27

Marc Gravell

What exactly do you mean with fully temporal database?

http://en.wikipedia.org/wiki/Temporal_database

But without the benefit of a temporal query language; regular SQL will
have to suffice...

May 30 '07 #28

Frans Bouma [C# MVP]

Marc Gravell wrote:

What exactly do you mean with fully temporal database?

http://en.wikipedia.org/wiki/Temporal_database

But without the benefit of a temporal query language; regular SQL will
have to suffice...

That's indeed a problem. There are relational model tricks to work
around this a bit, but you always have to do some work indeed.

FB

May 31 '07 #29
